Contents

1 Overview of 3D-RAM and Its Functional Blocks
  Introduction
  Frame buffer design example
  Simplified 3D-RAM block diagram
  3D-RAM functional blocks
  Block, page, and page group
  DRAM banks and basic DRAM operations
  Pixel buffer
  Video buffers
  Global bus
  Pixel ALU basics
  ROP/blend units
  Dual compare unit
  Pipelining
  The picking logic

2 Pin Descriptions and Pinouts
  Common pins
  Pixel ALU interface
  DRAM control
  Video interface
  Test access port
  Power & ground
  3D-RAM pinouts
  Tracking label
  Normal pinout diagram
  Reverse pinout diagram

3 Pixel ALU Operations
  Elements of the pixel buffer
  Block and word
  Dirty tag
  Using dirty tag for color expansion
  Plane mask
  Elements of the pixel ALU
  ROP/blend units
  Stencil modes
  16-bit color mode
  Dual compare unit
  Pipelining
  The picking logic
  Operations of the pixel ALU
  Register operations
    Identification register (id[31:0])
    Plane mask register (pm[31:0])
    Constant source register (csr[35:0])
    Match mask register (mtm[31:0])
    Magnitude mask register (mgm[31:0])
    ROP/blend control register (rbc[31:0])
    Compare control register (ccr[31:0])
    Write address control register (wac[31:0])
    An application of the write address control register
    Blend_2 control register (bld2[31:0])
    Preblend control register (pbc[31:0])
    Stencil planes register (stpl[31:0])
    Stencil control register (stc[31:0])
    Pass_ins select register (pins[31:0])
    Color depth select register (cds[31:0])
    Prohibited register access
  Pixel data operations
    Stateless initial data write
    Stateless normal data write
    Stateful initial data write
    Stateful normal data write
    Replace dirty tag
    OR dirty tag

4 DRAM Operations
  An overview of DRAM operations
  Description of DRAM operations
    Unmasked write block (UWB)
    Masked write block (MWB)
    Precharge bank (PRE)
    Video transfer (VDX)
    Video buffer load
    Video output operation
    Initialize and abort video output
    Prohibited video operation sequence
    Duplicate page (DUP)
    Read block (RDB)
    Access page (ACP)
    No operation (NOP)

5 Pixel ALU Pipelines and DRAM Activities
  DRAM and pixel ALU interactions
  DRAM activities

6 Frame Buffer Organizations
  Introduction
  1280 x 1024 x 8 organization
  1280 x 1024 x 32 single buffered organization
  1280 x 1024 x 32 double buffered organization with Z
  640 x 512 x 8 double buffered organization with Z

7 Electrical Specifications
  Absolute maximum ratings
  Testing conditions
  DC specifications
  AC specifications
  Pixel ALU timing parameters
  DRAM timing parameters
  Video buffer timing parameters
  Boundary-scan timing parameters

8 Timing Diagrams

9 Packaging
  3D-RAM pinouts
  Normal pinout diagram
  Reverse pinout diagram
  Tracking label
  Mechanical drawing for 128-pin FP and RF packages
  Thermal characteristics
  Thermal resistance for single package
  Thermal resistance for twelve packages mounted on PCB

10 JTAG Boundary Scan
  Boundary-scan architecture
  The TAP controller
    Test-Logic-Reset state
    Run-Test/Idle state
    Select-DR-Scan state
    Capture-DR state
    Shift-DR state
    Exit1-DR state
    Pause-DR state
    Exit2-DR state
    Update-DR state
    Select-IR-Scan state
    Capture-IR state
    Shift-IR state
    Exit1-IR state
    Pause-IR state
    Exit2-IR state
    Update-IR state
  Test data register
  Bypass register
  Boundary-scan register
  Instruction register
  Bypass instruction
  Sample/preload instruction
  Extest instruction
  vid_oe boundary-scan cell

11 Formal Specification of Operations
  Elements
  Bit ordering of elements
  Access page
  Duplicate page
  Precharge bank
  Read block
  Masked write block
  Unmasked write block
  Video transfer
  Video cycle
  Data read
  Stateless initial data write
  Stateless normal data write
  Replace dirty tag
  OR dirty tag
  Write plane mask register

Appendix A Glossary

Revision History

Mitsubishi Electronic Device Group, 3D-RAM (M5M410092B), Rev. 1.03

Rev. 0.95
- First version of distributed databook.

Rev. 0.96
- Chapter 2: The "Total" entry in Table 2.2 on p. 19 was corrected. Entries in Table 2.6 on p. 22 were corrected. The tracking label mnemonic on p. 22 was corrected to show the speed grade "12".
- Chapter 3: Entries in Table 3.5 on pp. 39-41 were corrected. Note 1 was added to Table 3.13 on p. 57. The description of rbc[8n+4] on p. 60 was corrected. The description of bld2[29:28] on p. 65 was corrected. The description of pbc[29:28] on p. 66 was corrected. The mnemonic pbc on p. 66 was corrected. The note for the color depth select register on p. 71 was corrected.
- Chapter 7: The speed grade "13" was replaced by the speed grade "12" in Tables 7.4 through 7.9. Note that both the "10" and "12" speed grades now have the same values for video buffer timing parameters.
- Chapter 8: All figure numbers and table numbers were corrected. The speed grade "13" was replaced by the speed grade "12" in Tables 8.2 through 8.13. Note that both the "10" and "12" speed grades now have the same values for video buffer timing parameters.
- Chapter 9: The mnemonic for the tracking label was corrected to show the "12" speed grade. Table 9.43 on p. 157 now shows the correct values for the parameters l and i2.
- Chapter 10: The paragraph "Boundary-Scan Register" on p. 169 now shows both bits 1 and 0 of the pass_in pins. Figure 10.4 on p. 170 now more correctly reflects the scan chain described on p. 171.

Rev. 1.00
- Chapter 2: The tracking label mnemonic on p. 22 was corrected to show the speed grade "10a".
- Chapter 3: The wording of the note on bit fields on p. 60 was corrected to clarify its meaning.
- Chapter 5: Tables 5.2 on p. 88, 5.4 on p. 90, 5.6 on p. 92, 5.8 on p. 94, 5.10 on p. 96, and 5.12 on p. 98 were deleted.
- Chapter 7: The speed grade "10a" was added to all tables. Entries in Table 7.4 on pp. 117 and 118, Table 7.5 on p. 119, Table 7.6 on p. 120, and Table 7.7 on p. 121 were corrected.
- Chapter 8: The speed grade "10a" was added to all tables. Entries in Table 8.3 on p. 131, Table 8.5 on p. 135, Table 8.6 on p. 135, Table 8.7 on p. 137, and Table 8.0 on p. 140 were corrected.
- Chapter 9: The tracking label mnemonic on p. 22 was corrected to show the speed grade "10a".

Rev. 1.02
- Chapter 2: The tracking label and pinout diagrams were updated to reflect the new 5-character manufacturing code.
- Chapter 3: The stateless mode of the color depth select register in Table 3.14 was corrected. A description of the prohibited write control register operation sequence was added.
- Chapter 4: A description of the prohibited video transfer operation sequence was added.
- Chapter 7: The value of I_CC in Table 7.3 was corrected to reflect the improvement of the vid_clk cycle time from 14.0 ns down to 12.0 ns. Minor editorial corrections were made in Table 7.4; no parameter values were changed.
- Chapter 8: Minor editorial corrections were made in Table 8.3; no parameter values were changed.
- Chapter 9: The tracking label and pinout diagrams were updated to reflect the new 5-character manufacturing code.

Rev. 1.03
- The table of contents was added.
- Chapter 3: Figure 3.28 and the corresponding paragraph were corrected.
- Chapter 4: Figure 4.6 and the corresponding paragraph were corrected.
- Chapter 9: The thermal resistance values in Tables 9.2 and 9.3 were updated.
1 Overview of 3D-RAM and Its Functional Blocks

Introduction

One of the traditional bottlenecks of 3D graphics hardware has been the rate at which pixels can be rendered into a frame buffer using conventional DRAM or VRAM. The 3D-RAM emerged from a complete rethinking of frame buffer technology and produces an order-of-magnitude increase in rendering performance. The essence of the 3D-RAM architecture is (1) an optimized array architecture that minimizes the average memory cycle time when rendering and (2) selective on-chip logic that converts the interface with the rendering controller from a read-modify-write mode to a write-mostly mode. In addition to the performance boost, the new architecture also significantly reduces the system chip count.

In 1994 Mitsubishi pioneered the introduction of the first member of the 3D-RAM family of products. This databook specifies all the features and operations of the third-generation product of the 3D-RAM family, which further elevates the performance of 3D-RAM-based 3D graphics systems. All references to 3D-RAM mean the product M5M410092B, unless specifically designated otherwise.

The factors responsible for the dramatic overall performance improvement include:

- New memory architecture
  - 10-Mbit DRAM array supporting a 1280 x 1024 x 8 frame buffer
  - Four independent, interleaved DRAM banks
  - 2048-bit SRAM pixel buffer as the cache between DRAM and ALU
  - Built-in tile-oriented memory addressing for rendering and scan-line-oriented memory addressing for video refresh
  - 256-bit global bus connecting DRAM banks and pixel buffer
  - Flexible dual video buffer supporting 85-Hz CRT refresh
- Write-mostly interface
- On-chip ALU
- Four ROP units supporting 16 raster operations on byte data
- Four blend units blending the old pixel value with new information
- On-chip hardware acceleration for all OpenGL blending modes (new)
- On-chip hardware acceleration for all OpenGL stencil modes (new)
- One 32-bit match comparator and one 32-bit magnitude comparator
- Concurrent operations of DRAM, pixel buffer, ALU, and video buffer
- 32-bit synchronous high-bandwidth data bus interface with the rendering controller
- Blending operations in both (8, 8, 8, 8) and (4, 4, 4, 4) color modes (new)
- One additional pass_in pin for flexible bit plane organization

Frame Buffer Design Example

Figure 1.1 is a simple frame buffer design example showing a 1280 x 1024 x 32 single buffered configuration. The rendering controller writes pixel data across the 128-bit bus to the four 3D-RAMs. The controller commands most of the 3D-RAM operations, including ALU functions, pixel buffer addressing, and DRAM operations. The controller can also command video display by setting up the RAMDAC and requesting video transfers from the 3D-RAMs.

With the 128-bit pixel data bus shown in Figure 1.1, four pixels can be moved across the bus in one cycle. There are two ways to organize the 3D-RAMs: (1) each 3D-RAM holds one of the 8-bit color components (R, G, B, or A) for all 1280 x 1024 pixels; (2) each 3D-RAM holds all 32 bits of a pixel value for 320 x 1024 pixels, allowing fast scrolling in the vertical direction and interleaving four 3D-RAMs in the horizontal direction. If the width of the data bus from the rendering controller to the 3D-RAMs is reduced to 64 bits, then two pixels are transferred in one cycle. Similarly, a 32-bit data bus can transfer only one pixel at a time. Chapter 6 provides more examples of frame buffer organizations using 3D-RAMs, such as 1280 x 1024 x 8, 320 x 1024 x 32, etc.

[Figure 1.1: 1280 x 1024 x 32 frame buffer consisting of four 3D-RAMs, shown together with a rendering controller and a RAMDAC. Each 3D-RAM receives 32 bits of pixel data plus address and control signals from the rendering controller and sends 16 bits of video data to the RAMDAC, which drives the monitor.]
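The sizing arithmetic behind this example can be checked with a short sketch. It assumes only the figures quoted in this chapter (four banks of 256 normal pages with 10,240 bits each, and 32-bit pixels on the bus); the function names are illustrative and are not part of the databook.

```python
import math

# Capacity of one 3D-RAM, from the figures given in this chapter:
# four banks x 256 normal pages x 10,240 bits per page.
DRAM_BITS_PER_CHIP = 4 * 256 * 10_240  # 10,485,760 bits (10 Mbits)


def chips_needed(width, height, bits_per_pixel):
    """Minimum number of 3D-RAMs whose DRAM can hold the whole frame buffer."""
    total_bits = width * height * bits_per_pixel
    return math.ceil(total_bits / DRAM_BITS_PER_CHIP)


def pixels_per_cycle(bus_bits, bits_per_pixel=32):
    """Whole pixels that cross the pixel data bus in one cycle."""
    return bus_bits // bits_per_pixel


if __name__ == "__main__":
    print(chips_needed(1280, 1024, 8))    # one chip for an 8-bit frame buffer
    print(chips_needed(1280, 1024, 32))   # four chips, as in Figure 1.1
    print([pixels_per_cycle(b) for b in (128, 64, 32)])
```

The same calculation gives twelve chips for 96 bits per pixel, which matches the eight color chips plus four Z chips mentioned later in this chapter.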
Simplified 3D-RAM Block Diagram

The 3D-RAM block diagram is shown in Figure 1.2. The DRAM array is partitioned into four independent banks of 2.5 Mbits each. Together, these four banks can support a screen resolution of 1280 x 1024 x 8. The independent banks can be interleaved to facilitate almost uninterrupted frame buffer update and, at the same time, can transfer pixel data to the dual video buffer for screen refresh.

Data from the DRAM banks is transferred over the 256-bit global bus to the triple-ported pixel buffer. The pixel buffer consists of eight blocks, each of which is 256 bits and is updated in a single transfer on the global bus. Hence, the memory size of the pixel buffer is 2 kbits. The ALU uses two of the pixel buffer ports to read and write data in the same clock cycle.

Each video buffer is 80 x 8 bits (640 bits) and is loaded in a single DRAM operation. One video buffer can be loaded while the other is sending out video data.

[Figure 1.2: Simplified 3D-RAM block diagram. Four DRAM banks A, B, C, and D (2.5 Mbits each) connect over the 256-bit global bus to the pixel buffer (2 kbits), which connects to the ALU and the 32-bit pixel data pins. The banks feed video buffers I and II over 640-bit paths, and the video buffers drive the 16-bit video data output.]
3D-RAM Functional Blocks

The 3D-RAM has five major functional blocks: DRAM banks, video buffers, pixel buffer, global bus, and pixel ALU. The following sections provide a quick overview of each of these functional blocks. Chapter 3 describes details of the pixel ALU operations, Chapter 4 presents specifics of the DRAM operations, and Chapter 5 provides examples of parallelism between the pixel ALU operations and the DRAM operations. To give readers a better grasp of these functional blocks, we first describe the memory units on which they operate.

Block, Page, and Page Group

A word has 32 bits and is the unit of data operations within the pixel ALU and between the pixel ALU and the pixel buffer. When the pixel ALU accesses the pixel buffer, both a block address and a word within that block must be specified. Since there are eight blocks in the pixel buffer and eight words in a block, the upper three bits of the input pins palu_a designate the block, and the lower three bits select the word. The data in a word is directly mapped to palu_dq[31:0] in corresponding order. That is, bit 0 of the word is mapped to palu_dq0, bit 1 to palu_dq1, and so on. Although an ALU write operation operates on one word at a time, each of the four bytes in a word may be individually masked. The mapping is also direct and linear: byte 0 is palu_dq[7:0], byte 1 is palu_dq[15:8], byte 2 is palu_dq[23:16], and byte 3 is palu_dq[31:24].

A block has 256 bits and is the unit of memory operations between a DRAM bank and the pixel buffer over the global bus. The input pins dram_a select a block from the pixel buffer and a block from a page of a DRAM bank. The DRAM operations on block data are unmasked write block (UWB), masked write block (MWB), and read block (RDB). These operations are described in detail in "Description of DRAM Operations" on page 44.

A page in a DRAM bank is organized into 10 x 4 blocks. Since a block has 256 bits, a page has 10,240 bits. There are four DRAM banks in a 3D-RAM chip, and the pages with the same page address in all four DRAM banks compose a page group. Therefore, a page group has 20 x 8 blocks.

Note that in Figure 1.3 the block and page are purposely drawn as rectangular shapes. The user may relate these to a tiled frame buffer memory organization. For example, if the display resolution is 1280 x 1024 x 8, then a (32-bit) word contains four pixels. Since a block may be viewed as having 2 x 4 words, it contains 8 x 4 pixels. A page is organized into 10 x 4 blocks, so it contains 80 x 16 pixels, and a page group holds 160 x 32 pixels. Finally, a screen is composed of 8 x 32 page groups.

The advantage of such a frame buffer memory organization is the minimization of the page miss penalty. 3D objects frequently occupy portions of multiple scan lines. Since in this case a page contains 80 x 16 pixels instead of 1,280 x 1 pixels, page misses are reduced. When an object extends beyond a page boundary, bank interleaving allows hidden precharge and uninterrupted memory access. Details of the various frame buffer memory organizations using 3D-RAMs are discussed in Chapter 6.

On the other hand, to support screen refresh, the video buffer must output pixel data one scan line at a time. The internal organization of a page also allows data to be transferred from a page to the video buffer one scan line at a time, where a page holds sixteen scan lines of 80 bytes each. See the section "Video Buffers" on page 7 for a summary and the section "Video Transfer (VDX)" on page 46 for full details.
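The tile sizes quoted above for the 1280 x 1024 x 8 case follow from a few multiplications, sketched below. The constants come from this section; the helper names and the `locate` function are hypothetical, and the sketch deliberately does not say which bank inside a page group holds a given pixel, since this section does not specify that.

```python
# Tile geometry for the 1280 x 1024 x 8 example (all sizes in pixels, width x height).
PIXELS_PER_WORD = 32 // 8                      # a 32-bit word holds four 8-bit pixels
BLOCK = (2 * PIXELS_PER_WORD, 4)               # a block is 2 x 4 words -> 8 x 4 pixels
PAGE = (10 * BLOCK[0], 4 * BLOCK[1])           # a page is 10 x 4 blocks -> 80 x 16 pixels
PAGE_GROUP = (2 * PAGE[0], 2 * PAGE[1])        # 20 x 8 blocks -> 160 x 32 pixels


def page_groups_on_screen(width=1280, height=1024):
    """Number of page groups across and down the screen."""
    return (width // PAGE_GROUP[0], height // PAGE_GROUP[1])


def locate(x, y):
    """Map a screen pixel to (page-group column, page-group row, x and y inside the group)."""
    return (x // PAGE_GROUP[0], y // PAGE_GROUP[1],
            x % PAGE_GROUP[0], y % PAGE_GROUP[1])


if __name__ == "__main__":
    print(BLOCK, PAGE, PAGE_GROUP, page_groups_on_screen())
    print(locate(1279, 1023))
```

A scan-line-oriented page would hold the same 1,280 pixels as a single row, so a small triangle spanning 16 scan lines would touch up to 16 pages instead of one.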
[Figure 1.3: Relations and addressing scheme of blocks and words in the pixel buffer and in the DRAM page. The pixel buffer holds blocks 0 through 7; each block holds words 0 through 7, and each word holds four bytes (bits 7:0, 15:8, 23:16, and 31:24). palu_a[5..0] selects one of eight blocks from the pixel buffer and one of eight words from the selected block. dram_a[8:0] selects one of eight pixel buffer blocks and, within a page of a DRAM bank (10 blocks wide by 4 blocks high, numbered 00 through 27 hexadecimal down each column), a block in the width and in the height. Blocks move between the DRAM page and the pixel buffer over the 256-bit global bus.]
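The pixel ALU side of this addressing scheme can be sketched as follows. The split of palu_a into block and word fields and the byte-to-pin mapping are taken from the text; the byte-enable argument and its polarity (1 means the byte is written) are assumptions made for illustration, and the function names are hypothetical.

```python
def decode_palu_a(palu_a):
    """Split the 6-bit palu_a value into (block, word): upper 3 bits, lower 3 bits."""
    if not 0 <= palu_a < 64:
        raise ValueError("palu_a is a 6-bit address")
    return (palu_a >> 3) & 0x7, palu_a & 0x7


def word_bytes(word):
    """Return [byte 0, byte 1, byte 2, byte 3]; byte 0 is bits 7:0, byte 3 is bits 31:24."""
    return [(word >> (8 * lane)) & 0xFF for lane in range(4)]


def masked_word_write(old_word, new_word, byte_enable):
    """Write only the enabled bytes of a 32-bit word.

    byte_enable is a 4-bit value; bit n = 1 is ASSUMED to mean byte n is written.
    """
    result = old_word
    for lane in range(4):
        if (byte_enable >> lane) & 1:
            lane_mask = 0xFF << (8 * lane)
            result = (result & ~lane_mask) | (new_word & lane_mask)
    return result & 0xFFFFFFFF


if __name__ == "__main__":
    print(decode_palu_a(0b101_010))                    # block 5, word 2
    print([hex(b) for b in word_bytes(0x11223344)])
    print(hex(masked_word_write(0x11223344, 0xAABBCCDD, 0b0101)))
```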
DRAM Banks and Basic DRAM Operations

The 3D-RAM contains four independent DRAM banks, which can be interleaved to facilitate hidden precharge or access in one bank while screen refresh is being performed in another bank. Each DRAM bank has 256 pages with 10,240 bits per page, for a total storage of 2,621,440 bits. An additional 257th page can be accessed for special functions or used to hold off-screen data.

A row decoder takes the 9-bit page address signals and generates 257 word lines, one for each page. The word lines select which page is connected to the sense amplifiers. The sense amplifiers read and write the page selected by the row decoder. Because the sense amplifiers retain data after the read/write operations, they function like a direct-mapped level-two pixel cache. (The pixel buffer, which is discussed on page 7, functions as a level-one pixel cache in a frame buffer with 3D-RAMs.)

During an access page (ACP) operation, the row decoder selects a page by activating its word line. Activating the word line of a particular page transfers the bit charges of that page to the sense amplifiers, which amplify the charges. After the sensing and amplification are completed, the sense amplifiers are ready to interface with the global bus or video buffer. In a way, ACP may be viewed as a "write cache" operation on the sense amplifiers as a level-two pixel cache.

The activated word line remains connected to the sense amplifiers after the ACP operation until the subsequent precharge bank operation. Consequently, when a block of the sense amplifiers is updated by a block write operation (UWB or MWB), the corresponding block in the DRAM array is also updated. The sense amplifiers therefore function as a "write-through" cache, and no write back to the DRAM array is necessary.

Alternatively, the data in the sense amplifiers can be written to any page in the same bank at this time, simply by selecting a word line without first equalizing the sense amplifiers. This function is called duplicate page (DUP). A typical application of this function is copying from the 257th page to one of the 256 normal pages (all 10,240 bits at a time) for fast area fill.

When the sense amplifiers in a DRAM bank complete the read/write operations with the global bus or video buffer, a precharge bank (PRE) operation usually follows. A precharge bank cycle simply deactivates the selected word line corresponding to the current page and equalizes the sense amplifiers. The PRE operation may be viewed as the close of a page access or as the preparation for the subsequent page access. The DRAM bank must be precharged prior to accessing a new page.

[Figure 1.4: DRAM bank consisting of row decoder, address latch, DRAM array (257 pages, 10,240 bits per page), and sense amplifiers.]
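The sequence of ACP, block write, DUP, and PRE can be illustrated with a toy behavioral model. This is only a sketch under stated assumptions: the class and method names are hypothetical, the sense amplifiers are modeled as a direct view of the active page (which gives the write-through behavior described above), and the model assumes the original page stays active after a DUP, a detail this chapter does not state.

```python
PAGES_PER_BANK = 257            # 256 normal pages plus the extra 257th page
BYTES_PER_PAGE = 10_240 // 8    # 1,280 bytes
BYTES_PER_BLOCK = 256 // 8      # 32 bytes
BLOCKS_PER_PAGE = BYTES_PER_PAGE // BYTES_PER_BLOCK  # 40 blocks (10 x 4)


class DramBank:
    """Toy model of one bank; block indices are a flat 0..39 for simplicity."""

    def __init__(self):
        self.pages = [bytearray(BYTES_PER_PAGE) for _ in range(PAGES_PER_BANK)]
        self.active = None  # page currently connected to the sense amplifiers

    def acp(self, page):
        """Access page: only legal on a precharged bank."""
        if self.active is not None:
            raise RuntimeError("bank must be precharged before a new page access")
        self.active = page

    def pre(self):
        """Precharge bank: close the current page."""
        self.active = None

    def _require_active(self):
        if self.active is None:
            raise RuntimeError("no page is active")

    def write_block(self, block, data):
        """Block write: updates the sense amplifiers and, write-through, the array."""
        self._require_active()
        if len(data) != BYTES_PER_BLOCK:
            raise ValueError("a block is 32 bytes")
        start = block * BYTES_PER_BLOCK
        self.pages[self.active][start:start + BYTES_PER_BLOCK] = data

    def read_block(self, block):
        self._require_active()
        start = block * BYTES_PER_BLOCK
        return bytes(self.pages[self.active][start:start + BYTES_PER_BLOCK])

    def dup(self, dest_page):
        """Duplicate page: copy the sense amplifier contents to another page.

        ASSUMPTION: the original page remains the active one afterwards.
        """
        self._require_active()
        self.pages[dest_page][:] = self.pages[self.active]


if __name__ == "__main__":
    bank = DramBank()
    fill = bytes([0x7F]) * BYTES_PER_BLOCK
    bank.acp(256)                       # use the 257th page as the fill pattern
    for blk in range(BLOCKS_PER_PAGE):
        bank.write_block(blk, fill)
    bank.dup(3)                         # fast area fill of page 3
    bank.pre()
    bank.acp(3)
    print(bank.read_block(0) == fill)
```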
Pixel Buffer

The pixel buffer is a 2048-bit SRAM organized into eight 256-bit blocks, as seen in Figure 1.3, and functions as a level-one write-back pixel cache. It has a 256-bit read/write port, a 32-bit read port, and a 32-bit write port. Referring to Figure 1.6, the 256-bit read/write port is connected to the global bus via a write buffer, and the two 32-bit ports are connected to the pixel ALU and the pixel data pins. All three ports can be used simultaneously as long as the same memory cell is not accessed. If the two 32-bit ports access the same cell, the write operation will be successful but the read data will be undefined.

A 1-bit dirty tag is assigned to each byte of data in the pixel buffer. Therefore, each block in the pixel buffer is associated with a 32-bit dirty tag in the dual-port dirty tag RAM. When a block is transferred from the sense amplifiers to the pixel buffer through the 256-bit port, the corresponding 32-bit dirty tag is cleared. When a block is transferred from the pixel buffer to a DRAM bank, the dirty tag determines which bytes are actually written. This feature can save as much as 50% of the power consumed by a 256-bit block write operation without the dirty tag.

The cache set associativity is determined external to the 3D-RAM, thereby permitting optimal cache design tailored to the particular graphics system.

Video Buffers

Each video buffer receives 640 bits of data at a time from one of the two DRAM banks connected to it. (The reader is reminded of the 3D-RAM block diagram in Figure 1.2.) Sixteen bits of data are shifted out onto the video data pins every video clock cycle at a 14-ns rate. It takes 40 video clocks to shift all data out of a video buffer. The video counter counts modulo 40 and toggles the buffer select line when the count wraps around to 0. The two video buffers can be alternated to provide a seamless stream of video data.

[Figure 1.5: Video transfer from a DRAM page to the video buffer. dram_a selects one of the sixteen 80-byte scan lines of the page; the selected 640 bits are loaded into the video buffer (40 x 16 bits) and shifted out 16 bits at a time on the video data pins.]
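The double-buffered shift-out can be sketched as a small model. The 40-word buffer depth, the modulo-40 counter, and the select toggle come from the text; the class name, the `load` method, and the order in which two bytes are packed into a 16-bit word (low byte first) are assumptions for illustration only.

```python
WORDS_PER_BUFFER = 640 // 16   # 40 video clocks to empty one buffer
BYTES_PER_LINE = 640 // 8      # one 80-byte scan line per load


class VideoOutput:
    def __init__(self):
        self.buffers = [[0] * WORDS_PER_BUFFER, [0] * WORDS_PER_BUFFER]
        self.select = 0   # buffer currently being shifted out
        self.count = 0    # video counter, counts modulo 40

    def load(self, which, line):
        """Load one 80-byte scan line into a buffer (ASSUMED low byte first)."""
        if len(line) != BYTES_PER_LINE:
            raise ValueError("a scan line is 80 bytes")
        self.buffers[which] = [line[i] | (line[i + 1] << 8)
                               for i in range(0, BYTES_PER_LINE, 2)]

    def clock(self):
        """One video clock: output 16 bits; toggle the buffer select on wrap-around."""
        word = self.buffers[self.select][self.count]
        self.count = (self.count + 1) % WORDS_PER_BUFFER
        if self.count == 0:
            self.select ^= 1
        return word


if __name__ == "__main__":
    video = VideoOutput()
    video.load(0, bytes(range(80)))
    video.load(1, bytes([0xFF]) * 80)
    first_line = [video.clock() for _ in range(WORDS_PER_BUFFER)]
    print(hex(first_line[0]), video.select, hex(video.clock()))
```

At the 14-ns video clock quoted above, one buffer lasts 40 x 14 ns = 560 ns, which is the window available for loading the other buffer.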
Global Bus

The global bus connects the pixel buffer to the sense amplifiers of all four DRAM banks. The global bus consists of 256 data lines. Referring to Figure 1.6, during a transfer from the pixel buffer to DRAM, the 256 bits are conditionally written depending on the 32-bit dirty tag and the 32-bit plane mask. When a data block is transferred from the pixel buffer to the sense amplifiers, the dirty tag and plane mask control which bits of the sense amplifiers are changed via the write buffer.

Note that all read/write operations are viewed from the perspective of the rendering controller. In other words, a read operation across the global bus always means a read by the pixel ALU; that is, data is transferred from a DRAM bank into the pixel buffer. Similarly, a write operation across the global bus means data is updated from the pixel buffer to a DRAM bank. This is also specifically noted in Figure 1.6 by the signals global bus write block enable and global bus read block enable.

[Figure 1.6: Tri-port pixel buffer, global bus, and dual-port dirty tag RAM. The pixel buffer (8 blocks x 256 bits) and the dirty tag RAM (8 blocks x 32 bits) are addressed on the global bus side by the block address from dram_a[8:6] and on the pixel ALU side by the block address from palu_a[5:3] and the word address from palu_a[2:0]. On a write (pixel buffer to DRAM), the write enable logic combines the 32-bit dirty tag and the 32-bit plane mask and drives the DRAM sense amplifiers through the write buffer; on a read (DRAM to pixel buffer), the dirty tag of the block is cleared.]
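The conditional write across the global bus can be sketched as a per-byte merge. The text states only that the 32-bit dirty tag and the 32-bit plane mask together decide which bits change; the sketch below additionally assumes that dirty tag bit n covers byte n of the block, that the plane mask is applied identically to each of the eight 32-bit words, and that a 1 in either mask enables the write. The function name is hypothetical.

```python
BYTES_PER_BLOCK = 32  # 256-bit block


def masked_write_block(sense_amp, block, dirty_tag, plane_mask):
    """Return the new sense amplifier contents after a masked block write.

    sense_amp, block: 32-byte values; byte 4*w + k is byte k of word w.
    dirty_tag: 32 bits, bit n ASSUMED to cover byte n of the block.
    plane_mask: 32 bits, ASSUMED to apply to every word of the block.
    """
    if len(sense_amp) != BYTES_PER_BLOCK or len(block) != BYTES_PER_BLOCK:
        raise ValueError("blocks are 32 bytes")
    out = bytearray(sense_amp)
    for n in range(BYTES_PER_BLOCK):
        if (dirty_tag >> n) & 1:
            lane = n % 4                                  # byte position inside its word
            bit_mask = (plane_mask >> (8 * lane)) & 0xFF  # plane mask bits for that byte
            out[n] = (sense_amp[n] & ~bit_mask & 0xFF) | (block[n] & bit_mask)
    return bytes(out)


if __name__ == "__main__":
    old = bytes(32)
    new = bytes([0xFF]) * 32
    # Only bytes 0 and 5 are dirty; the plane mask protects the upper nibble of byte 1.
    result = masked_write_block(old, new, dirty_tag=0b100001, plane_mask=0xFFFF0FFF)
    print(result[:8].hex())
```

With a clear dirty tag nothing is written at all, which is the source of the power saving mentioned in the pixel buffer description.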
Pixel ALU Basics

The pixel ALU consists of four 8-bit ROP/blend units, each of which may be independently programmed to perform either a raster operation or a blending function, one 32-bit match compare unit, and one 32-bit magnitude compare unit. The two compare units are also commonly referred to as the dual compare units.

The motivation for including the pixel ALU on chip is to convert the interface from a read-modify-write interface to a write-mostly interface. This integration of logic with the memory arrays greatly improves rendering throughput by avoiding time-consuming reads and direction changes on the data bus.

The ROP/blend units and the dual compare units are highly pipelined. Page 11 contains a brief discussion of the ALU pipeline. The output of a ROP/blend unit is conditionally written to the pixel buffer, depending on the comparison results from the on-chip dual compare units and from the dual compare units of the preceding 3D-RAM chips. For example, for a 1280 x 1024 x 32 double-buffered graphics system with a 32-bit Z buffer, there are effectively 96 bits per pixel. In this case, eight 3D-RAMs are used as color chips and four as Z chips. The pixel ALUs of the Z chips perform magnitude comparisons and feed the comparison results via their pass_out pins to the corresponding color chips.

[Figure 1.7: Pixel ALU (pipeline stages are not shown). The 32-bit input data from palu_dq[31:0], together with the extension bits from palu_dx[3:0], and the 36-bit constant register supply the new operands; the 32-bit old data comes from the pixel buffer through the ALU read port. ROP/blend units 0 through 3 each combine one byte of old data with input or constant data for the corresponding byte and write their results through the ALU write port. The dual compare unit takes pass_in and produces pass_out.]

It is important to note that, due to the pipelining, the color chips do not wait for the magnitude comparison results from the Z chips; rather, the results of the ROP/blending operations and comparison operations on the color chips and the results of the magnitude comparison on the Z chips are all presented to the pixel buffer of the color chips in the same clock cycle. In this sense, the rendering controller can accomplish a pixel blending operation with Z compare and window ID compare all in a single clock cycle. Furthermore, because of the pipelining and the tri-ported architecture of the pixel buffer, the read and write operations may be performed on the pixel buffer of the 3D-RAM during the same clock cycle.

ROP/Blend Units

Each ROP/blend unit can be configured as either a ROP unit or a blend unit by setting a register bit. Each ROP unit can perform all 16 standard ROP functions. These functions are listed in Chapter 3. One of the operands of the ROP functions is the old data from the pixel buffer, and the other operand may be either the data from the primary I/O pins or the data from an internal register (called the constant register).

For the blending operation, the general equation is as follows:

  write data to pixel buffer = new term + (old data x old fraction)
                             = (new data x new fraction) + (old data x old fraction)

The 3D-RAM blend units accomplish what is called destination blending in a single mclk cycle, that is, the addition and the second multiplication in the above equation. In this case, the rendering controller must perform the multiplication of new data with new fraction (i.e., the source blending) and present the result as the new term to the 3D-RAM. In addition, the 3D-RAM can also accomplish the full blending by taking two mclk cycles, with a loop-back mechanism.
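The split between source blending (done by the rendering controller) and destination blending (done on chip) can be sketched for one 8-bit color component. The equation is the one given above; representing the fractions as Python floats in the range 0 to 1, rounding to the nearest integer, and saturating the result at 255 are assumptions made for this sketch, since the fixed-point formats are not described in this chapter. The function names are hypothetical.

```python
def clamp8(value):
    """Round and saturate to an 8-bit component (saturation is an ASSUMPTION)."""
    return max(0, min(255, round(value)))


def source_blend(new_data, new_fraction):
    """Controller side: new term = new data x new fraction."""
    return clamp8(new_data * new_fraction)


def destination_blend(new_term, old_data, old_fraction):
    """3D-RAM side: write data = new term + (old data x old fraction)."""
    return clamp8(new_term + old_data * old_fraction)


def full_blend(new_data, new_fraction, old_data, old_fraction):
    """Both halves of the equation evaluated back to back."""
    return destination_blend(source_blend(new_data, new_fraction),
                             old_data, old_fraction)


if __name__ == "__main__":
    print(destination_blend(100, 200, 0.5))
    print(full_blend(200, 0.5, 100, 0.5))
```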
Dual Compare Unit

Physically, the dual compare units consist of one 32-bit match compare unit and one 32-bit magnitude compare unit. Both match compare and magnitude compare are done in parallel. One of the sources is always the old data from the pixel buffer. The other source is independently selectable between the data from the palu_dq pins and the data from the constant source register. There are also two mask registers, namely match mask and magnitude mask, that define which bits of the 32-bit words will be compared and which will be "don't care."

One application of the match compare unit is window ID comparison, and the magnitude compare unit is typically used in the depth comparison of a Z-buffer algorithm for hidden surface removal. When these compare units are used together, the system can achieve hidden surface removal for only a specific window on the display in one cycle. Furthermore, since the data to be written into the pixel buffer always comes through the ROP/blend units, a system with 3D-RAM can achieve a pixel update with a raster or blending operation specifically on only the new objects in the selected window that are closer to the viewer than the existing objects in the frame buffer.

The results of both match compare and magnitude compare operations are logically ANDed together to generate the pass_out signal. The pass_in signal (fed from another 3D-RAM chip) and the internally generated pass_out signal are then logically ANDed together to produce a write enable signal to the pixel buffer. Thus, the pass_in and pass_out pins offer hardware support for display resolutions where multiple 3D-RAM chips are required, such as in the cases of 1280 x 1024 x 32 (single color buffer plus Z buffer) and 1280 x 1024 x 96 (double color buffer plus Z buffer).
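The compare and write-enable logic can be sketched in a few lines. The masking, the AND of the two compare results into pass_out, and the AND of pass_in with pass_out come from the text. The choice of "new less than old" as the magnitude test and the 8-bit window ID / 24-bit depth word layout in the example are assumptions for illustration; on the device the compare conditions are programmable.

```python
def dual_compare(new, old, match_mask, magnitude_mask, pass_in=True):
    """Return (pass_out, write_enable) for one 32-bit word.

    Bits that are 0 in a mask are "don't care" for that comparison.
    ASSUMPTION: the magnitude test passes when masked new < masked old.
    """
    match_ok = (new & match_mask) == (old & match_mask)
    magnitude_ok = (new & magnitude_mask) < (old & magnitude_mask)
    pass_out = match_ok and magnitude_ok
    write_enable = pass_in and pass_out
    return pass_out, write_enable


if __name__ == "__main__":
    # Hypothetical layout: bits 31:24 hold a window ID, bits 23:0 hold depth.
    WID_MASK, Z_MASK = 0xFF000000, 0x00FFFFFF
    stored = (5 << 24) | 1000
    closer_same_window = (5 << 24) | 900
    closer_other_window = (6 << 24) | 900
    print(dual_compare(closer_same_window, stored, WID_MASK, Z_MASK))
    print(dual_compare(closer_other_window, stored, WID_MASK, Z_MASK))
```

Chaining chips then amounts to wiring one chip's pass_out to the next chip's pass_in, so a color chip writes only when the Z chip's comparison has also passed.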
pipelining
the 3d-ram pixel alu pipeline is designed so that read and write operations can be performed with minimal delay. this is achieved by having all operations conform to a uniform 7-stage pipeline. figure 1.8 is an example that illustrates the efficiency afforded by the pipeline flow of pixel alu read/write operations. a pipeline stage begins with a rising edge of mclk and ends before the next rising edge of mclk. (in 3d-ram, all references to mclk are relative to the rising edge except for some boundary scan test operations.) for clarity, separate stage counts are provided for the first read and first write operations and are labeled as r1 through r4 and w1 through w7, respectively.
the read a operation is asserted for two cycles; read a is first presented in stage r1 and latched into the 3d-ram by clock 1 in stage r2. data a is piped out by clock 2 in stage r3 and becomes stable for sampling in stage r4. between read b and wc (write c), two single-cycle nops are inserted to guarantee an idle cycle for the data bus to turn around. on the other hand, a read operation can immediately follow a write operation, as shown by read g following wf.
to allow maximum bandwidth for the rendering controller, a write operation may be started every cycle. in this example, we start with the wc operation. the address and write instruction are presented in stage w1 and latched into the 3d-ram by clock 7 in stage w2; data c and wd are presented in stage w2 and latched into the 3d-ram by clock 8 in stage w3. then, after three cycles for internal processing, the valid pass_out pass c is piped out by clock 11 in stage w6. the actual updating of the pixel buffer takes place in stage w7. thus, n consecutive write operations take only 7 + n - 1 = n + 6 cycles to complete, including all internal activities.
it is important to point out that the effective write cycle time from the perspective of the rendering controller interface is only n + 1 cycles for n consecutive write operations, as shown by wc through wf.
figure 1.8 example of pixel port read/write operations that satisfy the pipeline flow
[timing diagram over mclk cycles 0 to 15. traces: the control inputs palu_a, palu_op, palu_be, palu_we, palu_en (read a, read b, nop, wc, wd, we, wf, read g); palu_dq, palu_dx (a, b, data c to data f, g); pass_out (pass c to pass f); hit. stage labels r1 to r4 and w1 to w8.]
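the two cycle counts quoted above can be written down directly. the function names are hypothetical; the formulas are the ones stated in the text.

```python
PIPELINE_STAGES = 7  # uniform pixel alu pipeline depth stated in the text


def total_write_cycles(n: int) -> int:
    """cycles for n back-to-back writes including all internal stages: 7 + n - 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return PIPELINE_STAGES + n - 1


def interface_write_cycles(n: int) -> int:
    """cycles seen at the rendering controller interface: n + 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return n + 1
```

for the four writes wc through wf in figure 1.8, this gives 10 cycles in total and 5 cycles at the interface.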
the picking logic
from the user's viewpoint, a common experience of the picking function in 2d computer graphics is using the mouse and the associated cursor to select an icon on the display screen, which highlights the selected icon in a different color. this is a basic function in interactive computer graphics, and 3d-ram provides the picking logic and the hit pin to support this picking function for selection of objects in a 3d scene. a picking function may involve redrawing the objects into the frame buffer and returning a list of objects that intersect with some predefined selection volume.
when multiple 3d-rams are used in a frame buffer design, determining whether pixel data is successfully written by any stateful write operation (see "pixel data operations" on page 40) during the redraw process requires that the comparison results on the pass_out pins of all chips be logically anded. if this logical operation is left to off-chip glue logic between the 3d-ram frame buffer and the rendering controller, excessive delay is unavoidable in this critical timing path. if the rendering controller is to perform this logical operation, extra pins must be provided by both the 3d-ram and the rendering controller, while the delay is still significant. the picking logic brings the glue logic on chip and provides an open-drain hit pin to interface with the rendering controller.
a block diagram of the picking logic is shown in figure 1.9. initially, the picking logic should be enabled and the hit flag should be cleared, which is done by writing to byte 3 of the compare control register. the hit pin will be set to high (i.e., not driven low by 3d-ram) after seven cycles (corresponding to the pipeline stage 8). in the figure below, this is indicated by the number 8 in the square box above the hit pin label.
this design of the pipeline flow for the hit flag and the hit pin prevents an incorrect hit value from the stateful data write operations before the picking logic is enabled. a sequence of stateful data write operations may be issued immediately after the register writing. a low value on the hit pin means that at least one of the stateful data writes passed the on-chip and off-chip comparison tests and the pixel data was written to the pixel buffer. if the hit pin is high, none of the stateful data writes passed and no pixel is updated. see figure 8.6, "picking logic timing," for an illustration of the operations described in this section.
figure 1.9 block diagram of the picking logic
[block diagram: compare control register bits d24 to d27 (pick enable, set hit flag), the stateful_we, pass_in and pass_out signals, the hit flag, pipeline stages 7 and 8, and the open-drain hit pin.]
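the behavior of the hit flag and the hit pin can be sketched with a small model. this is a simplification under stated assumptions: the class and method names are hypothetical, pipeline latency is ignored, and the register write is reduced to one method call.

```python
class PickingLogic:
    """simplified model of the hit flag and the open-drain, active-low hit pin."""

    def __init__(self) -> None:
        self.pick_enable = False
        self.hit_flag = False

    def enable_and_clear(self) -> None:
        """stands in for the write to byte 3 of the compare control register."""
        self.pick_enable = True
        self.hit_flag = False

    def stateful_write(self, pass_in: bool, pass_out: bool) -> bool:
        """returns True if the pixel is written; records a hit when picking is enabled."""
        written = pass_in and pass_out
        if self.pick_enable and written:
            self.hit_flag = True
        return written

    @property
    def hit_pin(self) -> int:
        """0 when the flag is set (pin driven low), 1 when the pin is released."""
        return 0 if self.hit_flag else 1
```

after a redraw, the rendering controller would sample `hit_pin`: a 0 means at least one stateful write passed all comparisons since the flag was cleared.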
pin descriptions and pinouts 2

this chapter describes the 3d-ram pins. unless otherwise specified, all signals comply with the low voltage ttl (lvttl) standard. the functional block diagram in figure 2.1 shows all i/o signals on the external pins. the master clock mclk synchronizes all operations of the pixel alu control and dram control. the video control specifies the video interface. the test access port is used for the jtag (joint test action group) boundary scan. the following sections describe each signal in detail.
common pins
these signals are common to several sections of the 3d-ram.
table 3.1 common control signals
signal name   pin count   i/o   description
mclk          1           i     master clock
reset         1           i     reset
total         2
mclk
the master clock mclk is used for timing synchronization of internal circuitry. all external timing parameters, except video output operation and boundary scan, are specified with respect to the mclk rising edge.
reset
the reset pin is an active low asynchronous signal used for power up and restart initialization. during power-up, the reset signal should be held low for at least 500 µs after stable vdd, so that the internal power supply can be stabilized. after the reset signal goes high, nine idle cycles must elapse before the internal registers can be reset to default values. the power-up reset procedure is illustrated in figure 8.1. when the reset signal is asserted low during normal operations, a restart reset sequence begins. the restart reset includes resetting registers in nine idle cycles and initializing the dram array as in the power-up reset. the restart reset sequence is shown in figure 8.3. in dram array initialization, the access page (acp) operation should be performed on one page for every dram bank, followed by the precharge bank (pre) operation for every bank. figure 8.3 shows two approaches to initializing the dram array.
figure 2.1 3d-ram functional block diagram with external pins
[block diagram: pixel control (palu_en, palu_we, palu_op, palu_a, palu_be), the alu with palu_dq (32), palu_dx (4), pass_out, pass_in and hit, the sram pixel buffer, the 256-bit global bus, dram banks a to d, video buffers i and ii (640-bit paths from the dram banks, 16-bit output), dram control (dram_en, dram_op, dram_bs, dram_a), video control (vid_clk, vid_cke, vid_qsf, vid_q, vid_oe), mclk, reset, and the test access port (scan_rst, scan_tck, scan_tms, scan_tdi, scan_tdo).]
pixel alu interface
these signals control the pixel alu and pixel buffer.
table 3.2 pixel alu control signals
signal name   pin count   i/o   description
palu_en       2           i     enable pixel alu operation starting next cycle
palu_we       1           i     pixel alu write enable
palu_op       3           i     pixel alu opcode
palu_a        6           i     read/write address
palu_be       4           i     byte write or output enable
palu_dq       32          i/o   data pins
palu_dx       4           i     data extension pins for blending
pass_out      1           o     compare output (special signal level, see table 7.2)
pass_in       2           i     compare input (special signal level, see table 7.2)
hit           1           o     picking logic flag output (open-drain, see table 7.2)
total         56
palu_en [1:0]
the palu_en [1:0] pins must be "11" to start a pixel alu operation. if either palu_en pin is "0," then all other pixel alu pins are ignored.
palu_we
palu_we indicates a write operation when high ("1") and a read operation when low ("0").
palu_op [2:0]
the palu_op [2:0] pins, together with palu_we, specify the operation to be performed. see table 3.4 for the pixel alu operation encoding.
palu_a [5:0]
the palu_a [5:0] pins provide an address for the specified operation.
palu_be [3:0]
the palu_be [3:0] pins apply to all read and write operations, including register writes and dirty tag writes. if palu_we is low ("0"), indicating a read, the palu_be pins are per-byte output enables. if palu_we is high ("1"), indicating a write, the palu_be pins are per-byte write enables. palu_be0 controls palu_dq [7:0]; palu_be1 controls palu_dq [15:8]; palu_be2 controls palu_dq [23:16]; and palu_be3 controls palu_dq [31:24].
palu_dq [31:0]
data is read from or written to the palu_dq [31:0] pins. the write address of the pixel buffer may be input from palu_dq [29:24] in some modes of operation. see "an application of the write address control register" on page 62.
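the per-byte write enables can be illustrated with a short sketch. the helper names are hypothetical, and the model ignores the rop/blend units, the compare results and the plane mask; it only shows how palu_be [3:0] selects bytes of palu_dq [31:0].

```python
def byte_enable_mask(palu_be: int) -> int:
    """expand the 4-bit palu_be value into a 32-bit mask, one 0xFF per enabled byte."""
    if not 0 <= palu_be <= 0xF:
        raise ValueError("palu_be is a 4-bit value")
    mask = 0
    for byte in range(4):
        if (palu_be >> byte) & 1:
            mask |= 0xFF << (8 * byte)  # palu_be0 -> bits 7:0, ..., palu_be3 -> bits 31:24
    return mask


def write_word(old_word: int, new_word: int, palu_be: int) -> int:
    """replace only the enabled bytes of old_word with the bytes of new_word."""
    mask = byte_enable_mask(palu_be)
    return (old_word & ~mask & 0xFFFFFFFF) | (new_word & mask)
```

with `palu_be = 0b1000`, only palu_dq [31:24] is written and the lower three bytes keep their old values.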
palu_dx [3:0]
extra high-order bits of palu_dq data are provided by palu_dx [3:0]. palu_dx0 is associated with palu_dq [7:0]; palu_dx1 is associated with palu_dq [15:8]; palu_dx2 is for palu_dq [23:16]; and palu_dx3 is for palu_dq [31:24].
pass_out
the comparison result of the dual compare unit is output on the pass_out pin. pass_out is low ("0") only when the pixel alu operation during the fifth stage of the pixel alu pipeline is a stateful initial/normal data write operation (see "pixel data operations" on page 40) and when either match comparison or magnitude comparison fails. otherwise, pass_out is high ("1"), indicating either that the pixel alu operation is not a stateful initial/normal data write or that both comparison tests passed during the stateful initial/normal data write.
pass_in [1:0]
when the pass_in [1:0] pins are high ("11") and the internal comparison test also passes (pass_out is high ("1")), data is written to the pixel buffer if the pixel alu operation is a stateful normal/initial data write. each of the pass_in [1:0] pins may be individually masked by the pass_ins select register bits 0 and 8, pins [0, 8], respectively.
hit
the hit pin is an open-drain, active low output. this pin reflects the internal status of the hit flag. see "compare control register (ccr [31:0])" on page 36 for a detailed description.
dram control
these signals command operations on the four dram banks, global bus and video buffer. the dram control signals are listed in table 3.3.
dram_en
when dram_en is high ("1") at the rising edge of mclk, a dram operation is initiated at the next clock cycle. only the selected dram bank is enabled.
dram_op [2:0]
the dram opcode dram_op [2:0] specifies the dram operation. see table 4.1 for the dram operation encoding.
dram_bs [1:0]
dram_bs [1:0] is used to select one out of four banks.
the selection codes are: "00" for bank a, "01" for bank b, "10" for bank c, and "11" for bank d.
table 3.3 dram control signals
signal name   pin count   i/o   description
dram_en       1           i     enable dram operation at next cycle
dram_op       3           i     dram opcode
dram_bs       2           i     dram bank select
dram_a        9           i     address for page, block, and video line
total         15
dram_a [8:0]
the address pins dram_a [8:0] are used to select one of the following: (i) a page in a dram bank, (ii) a block of data to be transferred between the sense amplifiers of a dram bank and the pixel buffer over the global bus, or (iii) 80 bytes of video data from the sense amplifiers of a dram page to a video buffer. details are described in chapter 4, "dram operations."
video interface
these signals interface with a video ramdac chip. the video signals are listed in table 3.4.
vid_clk
vid_clk is a free running or gated video shift clock.
vid_cke
vid_cke is a synchronous vid_clk enable signal. when vid_cke is high ("1"), the next vid_clk cycle will be enabled. the video counter will also be enabled in the next cycle.
vid_oe
vid_oe is an asynchronous video output enable for vid_q. the video data bus is enabled when vid_oe is high ("1").
vid_q [15:0]
with the 16-bit video data bus vid_q [15:0], two bytes of data can be clocked out on the same cycle. in the 8-bit video buffer, the output format is arranged as even bytes on vid_q [7:0] and odd bytes on vid_q [15:8]. a detailed description of the two output data formats, normal mode and reversed mode, is in "video output operation" on page 48.
vid_qsf
the vid_qsf output indicates which video buffer is currently providing video data. vid_qsf is low ("0") when video buffer i is shifting data out. vid_qsf is high ("1") when video buffer ii is shifting data out.
test access port
these signals interface to the test access port for partial compliance with the ieee standard 1149.1 test access port and boundary-scan architecture. each of the three input pins scan_rst, scan_tms, and scan_tdi has an internal pull-up resistor of 10 kohm. see chapter 10, "jtag boundary scan," for more details.
table 3.4 video signals
signal name   pin count   i/o   description
vid_clk       1           i     video clock
vid_cke       1           i     video clock enable
vid_oe        1           i     video output enable
vid_q         16          o     video data bus
vid_qsf       1           o     video buffer indicator
total         20
table 3.5 serial test signals
signal name   pin count   i/o   description
scan_rst      1           i     scan reset
scan_tck      1           i     scan clock
scan_tms      1           i     scan test mode select
scan_tdi      1           i     scan test data input
scan_tdo      1           o     scan test data output
total         5
power & ground
there are 13 power supply pins and 16 ground pins. the nc pin should not be connected.
table 3.6 power and ground
signal name   pin count   description
vss           16          ground
vdd           13          power supply
nc            1           no connection
total         30
3d-ram pinouts
there are two pinouts for 3d-ram: the normal pinout, with pin 1 located at the lower left hand corner and specially marked by a small circle; and the reverse pinout, with pin 1 located at the upper left hand corner and marked by a large circle and a pointing triangle. the device in normal pinout is designated by the letters "fp" in the product number, and the device in reverse pinout by the letters "rf." in both pinouts, the mapping of pin number with pin name is identical.
tracking label
on the top surface of the 3d-ram package, a tracking label is printed below the mitsubishi logo and the 3d-ram product number. the tracking label consists of 7 numbers followed by a dash and a speed/power grade designation and is represented by the mnemonic "dddmmmmm-nn." this mnemonic is explained below.
ddd: data code
mmmmm: manufacturing code
nn: speed/power grade
10a: t clk (min) = 10 ns
10: t clk (min) = 10 ns for all operations except t clk (min) = 12 ns for alpha saturate logic
12: t clk (min) = 12 ns
normal pinout diagram
[package drawing m1048, M5M410092Bfp, with tracking label dddmmmmm-nn. pin names are paired below with pin numbers in the order both appear in the drawing.]
pins 1 to 38: 1 scan_tms, 2 scan_tck, 3 scan_rst, 4 vid_q8, 5 vid_q9, 6 vss, 7 vid_q10, 8 vid_q11, 9 vid_q12, 10 vid_q13, 11 vdd, 12 vid_q14, 13 vid_q15, 14 vid_qsf, 15 vid_cke, 16 vss, 17 vss, 18 pass_in0, 19 vdd, 20 vid_clk, 21 vss, 22 pass_in1, 23 vss, 24 vid_oe, 25 hit, 26 vid_q0, 27 vid_q1, 28 vdd, 29 vid_q2, 30 vid_q3, 31 vid_q4, 32 vid_q5, 33 vss, 34 vid_q6, 35 vid_q7, 36 scan_tdo, 37 scan_tdi, 38 vdd
pins 39 to 64: 39 vdd, 40 dram_a0, 41 dram_a1, 42 dram_a2, 43 dram_a3, 44 dram_a4, 45 dram_a5, 46 dram_en, 47 dram_op0, 48 vss, 49 palu_a0, 50 palu_a1, 51 palu_a2, 52 palu_en0, 53 palu_op0, 54 palu_op1, 55 vss, 56 palu_be0, 57 palu_be1, 58 palu_dx0, 59 palu_dx1, 60 palu_dq0, 61 palu_dq1, 62 palu_dq2, 63 palu_dq3, 64 vdd
pins 102 down to 65: 102 vdd, 101 palu_dq27, 100 palu_dq26, 99 palu_dq25, 98 palu_dq24, 97 vss, 96 palu_dq23, 95 palu_dq22, 94 palu_dq21, 93 palu_dq20, 92 vdd, 91 palu_dq19, 90 palu_dq18, 89 palu_dq17, 88 palu_dq16, 87 vss, 86 vss, 85 pass_out, 84 vdd, 83 mclk, 82 nc, 81 vss, 80 vss, 79 palu_dq15, 78 palu_dq14, 77 palu_dq13, 76 palu_dq12, 75 vdd, 74 palu_dq11, 73 palu_dq10, 72 palu_dq9, 71 palu_dq8, 70 vss, 69 palu_dq7, 68 palu_dq6, 67 palu_dq5, 66 palu_dq4, 65 vdd
pins 128 down to 103: 128 vdd, 127 reset, 126 dram_bs1, 125 dram_bs0, 124 dram_a8, 123 dram_a7, 122 dram_a6, 121 dram_op2, 120 dram_op1, 119 vss, 118 palu_a5, 117 palu_a4, 116 palu_a3, 115 palu_en1, 114 palu_we, 113 palu_op2, 112 vss, 111 palu_be3, 110 palu_be2, 109 palu_dx3, 108 palu_dx2, 107 palu_dq31, 106 palu_dq30, 105 palu_dq29, 104 palu_dq28, 103 vdd
reverse pinout diagram
[package drawing m1003, M5M410092Brf, with tracking label dddmmmmm-nn. the drawing lists the same pin names against the same pin numbers 1 to 128 as the normal pinout diagram; only the physical position of pin 1 on the package differs.]
pixel alu operations 3

this chapter discusses details on the elements and operations of the pixel buffer and pixel alu in 3d-ram. an operation that involves only the pixel alu and the pixel buffer is called a pixel alu operation. an operation that involves a dram array is categorized as a dram operation and is described in chapter 4, "dram operations." all registers of the 3d-ram are defined and explained in this chapter.
elements of the pixel buffer
block and word
as stated in chapter 2, the 2,048-bit pixel buffer is organized into eight 256-bit blocks. during a dram operation, these blocks can be addressed from the dram_a pins for block transfers on the global bus. during a pixel alu operation, the 32-bit pixel alu accesses the pixel buffer, which requires that both the block address and the 32-bit word within the block be specified. this is done via the 6-bit palu_a pins. the upper three bits select one of eight blocks in the pixel buffer, and the lower three bits specify one of the eight words in the selected block. the availability of both the dram_a and palu_a pins allows concurrent dram and pixel alu operations. since a word is mapped directly to palu_dq [31:0], palu_dq [7:0] is byte 0, palu_dq [15:8] is byte 1, palu_dq [23:16] is byte 2, and palu_dq [31:24] is byte 3. figure 3.1 is a simplified block diagram of these pixel buffer elements.
figure 3.1 pixel buffer elements
[diagram: the dirty tag ram with one 32-bit dirty tag (bits 0 to 31) for each of blocks 0 to 7; the pixel buffer with words 0 to 7 in each block; the byte lanes 7:0, 15:8, 23:16 and 31:24 of word 0 in block 0; and the plane mask with the same byte lanes.]
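the address and byte mapping described above can be sketched as follows. the function names are hypothetical; the bit positions are those given in the text.

```python
def decode_palu_a(palu_a: int) -> tuple[int, int]:
    """split the 6-bit palu_a value into (block, word): upper 3 bits, lower 3 bits."""
    if not 0 <= palu_a < 64:
        raise ValueError("palu_a is a 6-bit value")
    return palu_a >> 3, palu_a & 0b111


def split_bytes(word: int) -> list[int]:
    """return [byte 0, byte 1, byte 2, byte 3] of a 32-bit palu_dq word."""
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("word is a 32-bit value")
    return [(word >> (8 * byte)) & 0xFF for byte in range(4)]
```

for instance, `decode_palu_a(0b101010)` addresses word 2 of block 5.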
dirty tag
each data byte of a 256-bit block is associated with a dirty tag bit. this means that each 4-byte word is associated with four dirty tag bits and that a 32-bit dirty tag memory controls the corresponding 32-byte block data. the dirty tag ram in the pixel buffer contains eight such 32-bit dirty tags. there are three aspects of dirty tag operations: tag clear, tag set, and tag initialization. in normal operation modes, the clearing and setting of the dirty tag by these read and write operations are done by the on-chip logic in the 3d-ram and are essentially transparent to the rendering controller. the dirty tag bits are used by the 3d-ram internally and are not output to the external pins.
when data is transferred from the sense amplifiers of a dram bank to a pixel buffer block over the global bus (i.e., a read block transfer, which is a dram operation and is described in the next chapter), all 32 dirty tag bits associated with the selected pixel buffer block are cleared to "0." when data is transferred from a pixel buffer block to the sense amplifiers of a dram bank (i.e., a write block transfer, another dram operation), the dirty tag determines which data bytes can be written into the sense amplifiers. when a dirty tag bit is "1," the corresponding data byte is written under the control of the plane mask register (see the following section). when a dirty tag bit is "0," the corresponding byte of data in the dram bank is not written and retains its former value.
because the dirty tag prevents the unaltered bytes of a 256-bit block from being written into a dram bank, the power consumption of a write block transfer may be reduced by as much as 50%. this may be a significant power saving when a high-resolution display is constantly redrawn, such as in the case of high-quality full-screen animation.
when a data word is read from the 32-bit alu port of the pixel buffer, none of the 32-bit dirty tags is affected or has any effect on the outgoing data. the setting and initialization of the dirty tags are described in the paragraphs below.
table 3.1 pixel alu operations involving dirty tags
pixel operation: (stateful/stateless) normal data write
  pixel data: write bytes 0 to 3 from palu_dq pins (per palu_be pins)
  new dirty tag contents: the four addressed dirty tag bits are ored with palu_be [3:0]; the other 28 dirty tag bits are unchanged.
pixel operation: (stateful/stateless) initial data write
  pixel data: write bytes 0 to 3 from palu_dq pins (per palu_be pins)
  new dirty tag contents: palu_be [3:0] is written to the 4 addressed dirty tag bits; "0" is written to the 28 unaddressed dirty tag bits.
pixel operation: replace dirty tag
  pixel data: unchanged
  new dirty tag contents: palu_dq [31:0] replaces 32 dirty tag bits.
pixel operation: or dirty tag
  pixel data: unchanged
  new dirty tag contents: all 32 dirty tag bits are ored with palu_dq [31:0].
the dirty tag bits play an important role for all four write operations of the pixel alu to the pixel buffer: stateful/stateless initial data write and stateful/stateless normal data write. (these operations are also explained in "pixel alu operations" on page 56.) since the pixel alu operations conform to the 7-stage pipeline, the byte enable palu_be [3:0] data also gets into the pipeline when the operation is issued. at the end of the pipeline, pixel data is written into a pixel buffer word and the palu_be [3:0] pins can change the four corresponding dirty tag bits.
in the initial data write operation, the four addressed dirty tag bits are replaced with palu_be [3:0], while the other 28 dirty tag bits for the same block are cleared to "0." in the normal data write operation, each of the four addressed dirty tag bits is set to "1" only when the corresponding palu_be pin is "1." an addressed dirty tag bit is unchanged if the corresponding palu_be pin is "0." the other 28 dirty tag bits for the same block are also unchanged.
the 32 dirty tag bits for a particular block can all be replaced with the palu_dq data through the pixel alu operation "replace dirty tag." another pixel alu operation, "or dirty tag," changes the dirty tag contents for an addressed block with the result of the bitwise "or" function on the original dirty tag data and the palu_dq [31:0] data. the bit mapping between the dirty tag and palu_dq pins is illustrated in figure 3.1. for example, to change the dirty tag bits for word 0, the data should be placed on palu_dq0, palu_dq8, palu_dq16, and palu_dq24. to change the dirty tag bits for word 5, the data should be on palu_dq5, palu_dq13, palu_dq21, and palu_dq29. the following sub-section provides an application of these "replace dirty tag" and "or dirty tag" operations.
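the four dirty tag updates can be sketched as operations on a 32-bit integer. the function names are hypothetical. the word 0 and word 5 examples above place the four tag bits of word w at positions w, w + 8, w + 16 and w + 24; that byte b of the word corresponds to bit 8b + w is an assumption consistent with those examples.

```python
MASK32 = 0xFFFFFFFF


def addressed_bits(word: int, palu_be: int) -> int:
    """tag bits of the addressed word selected by palu_be (assumed bit 8*byte + word)."""
    if not 0 <= word < 8 or not 0 <= palu_be <= 0xF:
        raise ValueError("word is 0..7 and palu_be is a 4-bit value")
    bits = 0
    for byte in range(4):
        if (palu_be >> byte) & 1:
            bits |= 1 << (8 * byte + word)
    return bits


def normal_data_write(tag: int, word: int, palu_be: int) -> int:
    """addressed bits are ored with palu_be; the other 28 bits are unchanged."""
    return tag | addressed_bits(word, palu_be)


def initial_data_write(tag: int, word: int, palu_be: int) -> int:
    """addressed bits take palu_be; the other 28 bits are cleared (old tag is discarded)."""
    return addressed_bits(word, palu_be)


def replace_dirty_tag(tag: int, palu_dq: int) -> int:
    """palu_dq [31:0] replaces all 32 tag bits."""
    return palu_dq & MASK32


def or_dirty_tag(tag: int, palu_dq: int) -> int:
    """all 32 tag bits are ored with palu_dq [31:0]."""
    return (tag | palu_dq) & MASK32
```

`addressed_bits(5, 0b1111)` yields bits 5, 13, 21 and 29, matching the word 5 example in the text.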
using dirty tag for color expansion
many 2d rendering operations, such as text drawing, involve writing the same color to many pixels. these operations can be greatly accelerated by specifying individual pixels with a single bit and having hardware automatically expand each bit to an entire pixel. in 3d-ram, color expansion is done with the dirty tags associated with the pixel buffer blocks. the pixel color is written eight times to a pixel buffer block so that all of the pixels in the block are the same color. next, a 32-bit word is written to the dirty tag of the associated block. finally, the block is written to a dram bank. each pixel whose corresponding dirty tag bit is set is changed to the new color. the other pixels are unaffected. a new 32-bit word may be written to the dirty tag afterwards, and the same pixel buffer block may be written to a different part of the dram array. thus, one pixel buffer block can be used to hold the foreground color and used repeatedly to write text to the frame buffer.
plane mask
the 32-bit plane mask register (pm [31:0]) is used to qualify two write functions: (1) as per-bit write enables on 32-bit data for a stateful (initial/normal) data write operation from the pixel alu to the pixel buffer; (2) as per-bit write enables on 256-bit data for a masked write block (mwb) operation from the pixel buffer to the sense amplifiers of a dram bank over the global bus.
for a stateful data write, the plane mask serves as per-bit write enables over the entering data from the pixel alu write port; bit 0 of the plane mask enables or disables bit 0 of the incoming 32-bit pixel data, bit 1 of the plane mask enables or disables bit 1 of the incoming 32-bit pixel data, and so on. for a masked write block operation on the global bus side, when a pixel buffer block is transferred out to the dram, the 32-bit plane mask applies to every 32-bit word as per-bit write enables.
in other words, bit 0 of the plane mask enables or disables bits 0, 32, 64, 96, 128, 160, 192, and 224 of the 256-bit block; bit 1 of the plane mask enables or disables bits 1, 33, 65, 97, 129, 161, 193, and 225.
a particular sense amplifier bit can be written only if both the dirty tag bit and the plane mask bit are logically "1." this relationship among multiple enables and block data is illustrated in figure 3.2 for the first 40 bits (which are word 0 and byte 0 of word 1) of the global bus.
it is important to note the simultaneous effects of the plane mask. although 3d-ram allows concurrent operations of pixel alu and dram, the user is cautioned that there is only one set of plane mask bits, which affects both pixel alu write and dram write operations at the same time. when different plane maskings are required, concurrent pixel alu stateful data write operations and dram masked write block operations must be avoided. once the plane mask is written, the new plane mask is effective for only the stateful data write operations issued at later cycles, thereby conforming to the uniform 7-stage pipeline rule.
the plane mask register is loaded through a pixel alu "write control register" operation. the mapping of the plane mask to the palu_dq pins is the same as the word data to the pins (see also the section on "block and word" on page 21).
figure 3.2 the relationship between dirty tags and plane mask for first 40 bits of the global bus. (both the dirty tag bit and the plane mask bit must be 1 before a particular sense amp bit can be written.)
[diagram: dirty tag bits 0, 8, 16, 24 and 1 gate bytes 7:0, 15:8, 23:16, 31:24 and 39:32 of the block; plane mask bits 0 to 31 gate bits 0 to 31 and, again from bit 0, bits 32 to 39; the gated bits drive the sense amplifiers of a dram bank.]
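the combined effect of the dirty tag and the plane mask on a masked write block can be sketched as follows. this is a model under stated assumptions, not the device's implementation: the function name is hypothetical, a block is held as 32 bytes with byte b of word w at index 4w + b, and the dirty tag bit for that byte is taken to be bit 8b + w.

```python
def masked_write_block(sense_amp: bytes, block: bytes,
                       dirty_tag: int, plane_mask: int) -> bytes:
    """return the sense amplifier contents after a masked write block.

    a bit is written only if its byte's dirty tag bit is 1 and its
    plane mask bit is 1; the 32-bit plane mask repeats for every word.
    """
    if len(sense_amp) != 32 or len(block) != 32:
        raise ValueError("a block is 32 bytes (256 bits)")
    out = bytearray(sense_amp)
    for word in range(8):
        for byte in range(4):
            if not (dirty_tag >> (8 * byte + word)) & 1:
                continue  # clean byte: the dram keeps its former value
            mask = (plane_mask >> (8 * byte)) & 0xFF
            i = 4 * word + byte
            out[i] = (sense_amp[i] & ~mask & 0xFF) | (block[i] & mask)
    return bytes(out)
```

this also illustrates color expansion with 32-bit pixels: fill the block with the foreground color, set the dirty tag bits of the words to be drawn, and write the block; only those words change in the dram.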
elements of the pixel alu
chapter 2 presented an overview of the pixel alu, with an emphasis on the motivation and applications of the elements in the pixel alu. in this section, some of the same information is repeated, but the emphasis is on detailed technical specification. the elements of the pixel alu are four 8-bit rop/blend units, one 32-bit match compare unit, one 32-bit magnitude compare unit, and the picking logic. figure 3.3 shows the inputs to the rop/blend units and the dual compare unit. in the figure, bus "o" is the old data from the pixel buffer; "nx" and "n" are from the palu_dx and palu_dq pins, respectively; and finally, buses "kx" and "k" are from the internal 36-bit constant source register, with "kx" being the most significant four bits. the inputs to the dual compare unit are straightforward. the inputs to the rop/blend units are explained in the following sub-sections.
figure 3.3 pixel alu (pipeline stages are not shown)
[block diagram: rop/blend units 0 to 3 and the dual compare unit between the alu read port and alu write port of the pixel buffer. rop/blend unit i takes o byte i from the pixel buffer, {kx i, k byte i} from the constant source, and {nx 3, n [31:24], nx i, n byte i} from palu_dx [3:0] and palu_dq [31:0]. the dual compare unit is shown with o [31:0], n [31:0], k [31:0], pass_in [1:0] and pass_out.]
rop/blend units
each rop/blend unit can be independently configured as either a rop unit or a blend unit through the programming of the rop/blend control register. each rop unit can perform all 16 standard rop functions, which are listed in table 3.16. rop functions are performed on a byte of the old data ("o") from the pixel buffer and a byte of the new term, which is either the data from the data pins ("n") or the data from the constant source register ("k").
pixel alu blend factor selections (new)
for the blending operation, the general equation is as follows:
write data to pixel buffer = new term + (old data x old fraction)
= (new data x new fraction) + (old data x old fraction)
to each blend unit, an addend (e.g., the "new term" or 00h) is input from the palu_dx and palu_dq pins (marked as {"nx", "n"}), from the pixel buffer (marked as "o"), or from the constant source register (marked as {"kx", "k"}). multiplicand 1 (marked as "multp1") is the fraction term and is from one of five sources; multiplicand 2 (marked as "multp2") is the data term and is from one of six sources. see table 3.5 for a complete selection mapping of the addend and multiplicands.
in opengl terminology (see "the opengl graphics system: a specification (version 1.1)"), new data represents the color values of the source (src_color), which enter the pixel alu path from the palu_dq pins; old data represents the color values of the destination (dst_color), which are from the pixel buffer. new fraction is known as the source blend factor (sfactor); old fraction is known as the destination blend factor (dfactor). the color values, src_color and dst_color, can be represented in rgba quadruplet form as (rs, gs, bs, as) and (rd, gd, bd, ad), respectively. define sfactor and dfactor as (sr, sg, sb, sa) and (dr, dg, db, da), respectively.
the blending equation can be rewritten as:

write data to pixel buffer = (src_color x sfactor) + (dst_color x dfactor)
= (rs, gs, bs, as) x (sr, sg, sb, sa) + (rd, gd, bd, ad) x (dr, dg, db, da)
= (rs x sr + rd x dr, gs x sg + gd x dg, bs x sb + bd x db, as x sa + ad x da)

all the possible values for opengl blending factors are listed in table 3.2. the subtraction of quadruplets means subtracting them componentwise. the column "relevant factor" indicates whether the corresponding parameter can be used to specify the source or the destination blend factor. the first 11 rows in the table are parameters in the opengl specification 1.1; the last 4 rows are parameters specified by the opengl imaging extension gl_ext_blend_color.

the full blending function requires two multiplications and one addition for each of the four components in the quadruplet. enumerating the twelve possible values of the destination blend factor and the thirteen possible values of the source blend factor gives the 156 blending factor selection pairs shown in the matrix in table 3.3. 72 of these 156 pairs are required by the opengl specification 1.1 and the others come from the extension gl_ext_blend_color.

the majority of applications use a small number of pair combinations. most of the blending with (0,0,0,0) or (1,1,1,1) as a blending factor can be realized with a half blender, meaning that it requires only one multiplication and the addition in 3d-ram. furthermore, if one of the multiplications does not require destination colors or destination alpha from the frame buffer, this multiplication can be performed inside the rendering controller without having to read the destination data out of the frame buffer. thus, only a half blender is needed inside the 3d-ram to complete the blending equation in these cases. for the rest of the blending factor selections, the blending function
can be completed in two consecutive cycles using the 3d-ram's two-cycle blend operation, by looping back during the first cycle one of the two product terms in the blending equation and combining the looped-back product term with the other product term during the second cycle. 3d-ram accelerates 44 blending factor selection pairs with single clock cycle throughput by half blending, and the other 112 blending factor pairs in two cycles. in addition, in five cases a two-cycle blending factor pair may be completed in just one cycle if the alpha blending can be ignored.

blending operation (new)

the simplified block diagram of the blending unit is illustrated in figure 3.4. to execute a single-cycle blending operation, the multiplicands and addend (multp1, multp2, addend) must be selected by programming the rop/blend and blend_2 control registers. also, the rop/blend control register must be set for blending. once these registers are set, the blending operation is accomplished by performing a stateful write operation. each blend unit first performs the multiplication of multp1 and multp2 and then the addition of the resulting product with the addend, thereby completing a half blend.

to execute a two-cycle blend operation, it is necessary to program three registers: the rop/blend, blend_2 and preblend control registers. the preblend control register selects multp2 for the preblend cycle (the first cycle of the two-cycle blend operation) and the addend for the normal cycle (the second cycle of the two-cycle blend operation). during the preblend cycle, the addend is fixed to the {palu_dx, palu_dq} or {kx, k} bus, and multp1 is fixed to the {palu_dx, palu_dq} bus.
the rop/blend and blend_2 control registers are programmed to select the multp1 and multp2 components for the normal cycle; the addend selected by these two registers is ignored by the adder for the preblend cycle and may be "looped back" as one of the two choices for addend during the normal cycle. once these three registers have been programmed, an "initial two-cycle blending" operation (palu_op=110, palu_we=1) should be performed, followed by a stateful initial data write or stateful normal data write operation on the same pixel location (i.e. palu_a with the same block and word address and palu_be [3:0] with the same enable settings).

during the preblend cycle, multp1 and multp2 are multiplied and the result is "looped back" one stage in the pipeline. the addend is also "looped back" one stage. the addend and the multiplier output are then available as possible addends for the next cycle. next, the stateful write is issued with the multiplicands selected by the rop/blend and blend_2 control registers. the addend selected by the rop/blend and blend_2 registers will be ignored. the blending occurs just as it would for a single-cycle operation, except that the addend source is chosen to be either the "looped back" multiplier output or the "looped back" addend, based on the settings of the preblend control register.

to help sort out the different sources for the various blending factors for both the single-cycle half blending and the two-cycle full blending, table 3.3 is annotated with example sources of all opengl blending factors. for example, all blending factors related to the alpha component should be selected through the multp2 datapath; these include dst_alpha, one_minus_dst_alpha, and src_alpha_saturate and are annotated with "m2" in their respective rows and columns. some source blending factors are not passed to 3d-ram directly; instead, the product of the source blending factor and the source color is passed to 3d-ram as the addend term. these include zero, one, src_alpha, and one_minus_src_alpha.

figure 3.4 illustrates the above description with a simplified block diagram. the blocks labelled "n:nn", "n:n0", and "nx:n" on the blending path represent the manner in which the 8-bit blend units duplicate 4-bit data for the special (4,4,4,4) 16-bit color mode. specifically, "n:nn" means that the 4-bit data is nibble-wise duplicated to form 8-bit data; "n:n0" means that 8-bit data is formed by padding the lower nibble with 0000b; and "nx:n" means that 4-bit data is produced by truncating the lower nibble, regardless of its value. more explanations may be found in the section on "4-bit to 8-bit expansion for pixel alu."

note that the special opengl stencil mode, which will be described in the section on "stencil modes," uses portions of rop/blend unit 3 to accomplish its functions. for simplicity, the stencil logic is not shown in figure 3.4 and, for the most part, can be thought of as a separate unit. it is important to note, however, that the stencil logic uses portions of the blending path, and therefore rop/blend unit 3 cannot be used for blending when the opengl stencil mode is being used.

rop/blend units 0, 1 and 2 are identical, but unit 3 is slightly different because this unit typically handles the alpha data. the alpha-saturate block shown in figure 3.4 and figure 3.5 is only present in rop/blend unit 3. the result from the alpha-saturate block is routed to all four rop/blend units as a possible source of multp2. for the specifics of the data multiplexing and selections by the various register bits, refer to figure 3.6. the timing diagram of an example two-cycle blend operation is presented in figure 3.7.

table 3.2 source and destination blending factors

opengl parameter | relevant factor | computed blend factor
gl_zero | source or destination | (0,0,0,0)
gl_one | source or destination | (1,1,1,1)
gl_dst_color | source | (rd,gd,bd,ad)
gl_src_color | destination | (rs,gs,bs,as)
gl_one_minus_dst_color | source | (1,1,1,1) - (rd,gd,bd,ad)
gl_one_minus_src_color | destination | (1,1,1,1) - (rs,gs,bs,as)
gl_src_alpha | source or destination | (as,as,as,as)
gl_one_minus_src_alpha | source or destination | (1,1,1,1) - (as,as,as,as)
gl_dst_alpha | source or destination | (ad,ad,ad,ad)
gl_one_minus_dst_alpha | source or destination | (1,1,1,1) - (ad,ad,ad,ad)
gl_src_alpha_saturate | source | (f,f,f,1); f=min(as, 1-ad)
gl_constant_color_ext | source or destination | (rk,gk,bk,ak)
gl_one_minus_constant_color_ext | source or destination | (1,1,1,1) - (rk,gk,bk,ak)
gl_constant_alpha_ext | source or destination | (ak,ak,ak,ak)
gl_one_minus_constant_alpha_ext | source or destination | (1,1,1,1) - (ak,ak,ak,ak)

table 3.3 opengl blending factor selection matrix

legend: x = half blending in a single clock cycle; o = full blending in two cycles using the two-cycle blend operation; c = one-cycle blending with alpha blending ignored. k = constant source register; a = addend; m1 = multp1; m2 = multp2.

destination blend factor columns: d1 = zero (kx, k); d2 = one (m1); d3 = src_color (m1); d4 = one_minus_src_color (m1); d5 = src_alpha (m1); d6 = one_minus_src_alpha (m1); d7 = dst_alpha (m2); d8 = one_minus_dst_alpha (m2); d9 = constant_color (m1); d10 = one_minus_constant_color (m1); d11 = constant_alpha (m1); d12 = one_minus_constant_alpha (m1).

source blend factor | d1 | d2 | d3 | d4 | d5 | d6 | d7 | d8 | d9 | d10 | d11 | d12
zero (a) | x | x | x | x | x | x | x | x | x | x | x | x
one (a) | x | x | x | o | x | c, o | x | x | o | o | o | o
dst_color (m2) | x | x | o | o | o | o | o | o | o | o | o | o
one_minus_dst_color (m2) | x | x | o | o | o | o | o | o | o | o | o | o
src_alpha (a) | x | x | o | o | c, o | c, o | x | x | o | o | o | o
one_minus_src_alpha (a) | x | x | o | o | c, o | c, o | x | x | o | o | o | o
dst_alpha (m2) | x | x | o | o | o | o | o | o | o | o | o | o
one_minus_dst_alpha (m2) | x | x | o | o | o | o | o | o | o | o | o | o
src_alpha_saturate (m2) | x | x | o | o | o | o | o | o | o | o | o | o
constant_color (a) | x | x | o | o | o | o | o | o | o | o | o | o
one_minus_constant_color (a) | x | x | o | o | o | o | o | o | o | o | o | o
constant_alpha (a) | x | x | o | o | o | o | o | o | o | o | o | o
one_minus_constant_alpha (a) | x | x | o | o | o | o | o | o | o | o | o | o
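the pair counts quoted for table 3.2 and table 3.3 (156 pairs in total, 72 of them from opengl 1.1, 44 single-cycle pairs, 112 two-cycle pairs and five alpha-ignored shortcuts) can be cross-checked with a short script. the data below is transcribed from the two tables; the reading of the matrix symbols as x, o and c is an assumption, since the legend symbols are not legible in the source, and "c" here stands for a "c, o" cell.

```python
# (parameter, usable as source, usable as destination, part of opengl 1.1)
FACTORS = [
    ("zero", True, True, True), ("one", True, True, True),
    ("dst_color", True, False, True), ("src_color", False, True, True),
    ("one_minus_dst_color", True, False, True),
    ("one_minus_src_color", False, True, True),
    ("src_alpha", True, True, True), ("one_minus_src_alpha", True, True, True),
    ("dst_alpha", True, True, True), ("one_minus_dst_alpha", True, True, True),
    ("src_alpha_saturate", True, False, True),
    ("constant_color", True, True, False),
    ("one_minus_constant_color", True, True, False),
    ("constant_alpha", True, True, False),
    ("one_minus_constant_alpha", True, True, False),
]

sources = [f for f in FACTORS if f[1]]
dests = [f for f in FACTORS if f[2]]
total_pairs = len(sources) * len(dests)
gl11_pairs = sum(f[3] for f in sources) * sum(f[3] for f in dests)

# One string per source-factor row of table 3.3, one character per column.
PLAIN = "xxoooooooooo"
MATRIX = [
    "xxxxxxxxxxxx",   # zero
    "xxxoxcxxoooo",   # one
    PLAIN, PLAIN,     # dst_color, one_minus_dst_color
    "xxooccxxoooo",   # src_alpha
    "xxooccxxoooo",   # one_minus_src_alpha
] + [PLAIN] * 7       # remaining seven source factors

cells = "".join(MATRIX)
half_blend = cells.count("x")
two_cycle = cells.count("o") + cells.count("c")
alpha_ignored = cells.count("c")
print(total_pairs, gl11_pairs, half_blend, two_cycle, alpha_ignored)
```

the totals agree with the figures in the text, which supports this reading of the matrix symbols.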
figure 3.4 rop/blend unit n (pipeline stages are not shown)
[block diagram. inputs: o [8n+7:8n] from the pixel buffer, {nx n, n [8n+7:8n]}, {nx 3, n [31:24]}, {kx n, k [8n+7:8n]}, n [31:24], o [31:24] and the constant 100h. blocks: rop, mult, clamp, alpha-sat, the 4-bit to 8-bit expansion blocks for 16-bit mode (nx:n, n:nn, n:n0), and 1-stage delays for the loop-back paths. the selected multp1, multp2 and addend produce the intermediate result; the clamped result goes to the alu write port of the pixel buffer.]
note: the alpha-sat block is only present in rop/blend unit 3. the result is passed to all four rop/blend units.

figure 3.5 block diagram of the alpha-saturate unit
[block diagram. inputs n [31:24] and o [31:24], split into upper and lower nibbles (n [31:28], n [27:24], ~o [31:28], ~o [27:24]); a comparator (a > b); a four-input selector controlled by bld2 [29:28]; output min(n [31:24], ~o [31:24]) to all rop/blend units.]
note: this can be either a 4-bit or 8-bit comparator, based on the color depth select register.

figure 3.6 details of data selections in blend unit n by the various register bits
[block diagram of rop/blend unit n showing the multp1, multp2 and addend selectors, the multiplier, adder, clamp and rop blocks, and the nibble mode and nibble duplication direction logic. data inputs: n [8n+7:8n], k [8n+7:8n], o [8n+7:8n], n [31:24], nx n, nx 3, kx n, 0x00 and the alpha-sat result. select signals: rbc [3:0], rbc [8n+5], rbc [8n+6], rbc [8n+7], bld2 [8n], bld2 [8n+1], bld2 [8n+2], bld2 [8n+3], pbc [8n], pbc [8n+2], pbc [8n+3], cds [0], palu_be n, and the preblend command (palu_op [2:0] = 110) of the current or the previous clock cycle. output: 8 bits to the alu write port of the pixel buffer.]
notes: bit 9 is padded with "0". when palu_be n = "0", the upper nibble is copied to the lower nibble; when palu_be n = "1", the lower nibble is copied to the upper nibble.
* byte nibble control logic not shown for (4,4,4,4) blending mode.
** stencil logic not shown for the selection of rop or stencil op.
*** pipeline stages not shown.

figure 3.7 an example of a two-cycle blend operation
[timing diagram over mclk cycles 1 to 10. signals: palu_en, palu_be, palu_we; palu_op; palu_a (block:word); palu_dq, palu_dx (control register data for the rop/blend, blend_2 and preblend registers, then preblend data and normal data); pixel buffer (old data, new data); pass_in; pass_out.]
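the three nibble conversions named for the (4,4,4,4) 16-bit color mode ("n:nn", "n:n0" and "nx:n") are simple bit manipulations. the sketch below models them on plain integers; the function names are hypothetical and chosen only to mirror the block labels.

```python
def n_to_nn(nibble: int) -> int:
    """n:nn - duplicate a 4-bit value into both nibbles of a byte."""
    nibble &= 0xF
    return (nibble << 4) | nibble


def n_to_n0(nibble: int) -> int:
    """n:n0 - place a 4-bit value in the upper nibble, lower nibble 0000b."""
    return (nibble & 0xF) << 4


def nx_to_n(byte: int) -> int:
    """nx:n - keep the upper nibble of a byte and drop the lower nibble."""
    return (byte >> 4) & 0xF


print(hex(n_to_nn(0xA)), hex(n_to_n0(0xA)), hex(nx_to_n(0xAB)))
```

duplication maps 0h to 00h and fh to ffh, so the full 4-bit range spans the full 8-bit range, whereas zero padding maps fh only to f0h.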
the mathematical operations performed in the blend unit are summarized in table 3.4. the clamped result is written to the pixel buffer depending (1) on the pass_out pin, which is the result of the internal compare units, and (2) on the pass_in [1:0] pins, which receive the pass_out signal from the preceding 3d-ram.

table 3.4 mathematical operations in blend unit n

operand | range | source | comments
multiplicand 1 | 0.00h ~ 0.ffh (8-bit unsigned) | {nx n, n [8n+7:8n]}* | source is from palu_dx n and palu_dq [8n+7:8n] pins
multiplicand 1 | 0.00h ~ 0.ffh (8-bit unsigned) | {nx 3, n [31:24]}* | source is from palu_dx 3 and palu_dq [31:24] pins
multiplicand 1 | 0.00h ~ 0.ffh (8-bit unsigned) | o [8n+7:8n] | source is from the sram pixel buffer
multiplicand 1 | 0.00h ~ 0.ffh (8-bit unsigned) | {kx n, k [8n+7:8n]}* | source is from the internal constant register
multiplicand 1 | 1.00h (9-bit constant 1.00h) | 1.00h | multiplicand 1 greater than 1.00h is clamped to 1.00h
multiplicand 2 | 0 ~ 255 (8-bit unsigned) | o [8n+7:8n] | source is from the sram pixel buffer
multiplicand 2 | 0 ~ 255 (8-bit unsigned) | ~o [8n+7:8n] | source is inverted from o [8n+7:8n]
multiplicand 2 | 0 ~ 255 (8-bit unsigned) | min{n [31:24], ~o [31:24]} (alpha-sat) | source is from the alpha-saturate block in rop/blend unit 3
multiplicand 2 | 0 ~ 255 (8-bit unsigned) | n [31:24] |
multiplicand 2 | 0 ~ 255 (8-bit unsigned) | o [31:24] |
multiplicand 2 | 0 ~ 255 (8-bit unsigned) | ~o [31:24] |
addend | -256 ~ 255 (9-bit signed) | {nx n, n [8n+7:8n]} | source is from palu_dx n and palu_dq [8n+7:8n] pins
addend | -256 ~ 255 (9-bit signed) | {kx n, k [8n+7:8n]} | source is from the internal constant register
addend | -256 ~ 255 (9-bit signed) | previous addend {nx n, n [8n+7:8n]} or {kx n, k [8n+7:8n]} | source is from the previous stage addend (loop back blending)
addend | 0 ~ 255 (8-bit unsigned) | o [8n+7:8n] | source is from the sram pixel buffer
addend | 0 ~ 255 (8-bit unsigned) | multp1 x multp2 | source is from the previous stage multiplier output (loop back blending)
intermediate result | -256 ~ 510 (10-bit signed) | (multp1 x multp2) + addend |
clamped result | 0 ~ 255 (8-bit unsigned) | intermediate result | the clamped result is written to the pixel buffer if the pass condition is valid. if source > 255, then result = 255; else if source < 0, then result = 0; else result = source

* the multiplier is an 8 x 8 unsigned integer multiplier, so only the lower 8 bits of multiplicand 1 are multiplied with multiplicand 2. however, if nx n, nx 3, or kx n is "1" or if 1.00h is selected, then the calculated multiplier output is ignored and multiplicand 2 is passed through as the multiplier output.
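the arithmetic of table 3.4 can be modelled for one blend unit as follows. this is a minimal sketch under stated assumptions: the table does not say how the 16-bit product is reduced to the 0 ~ 255 range, so truncation by an 8-bit right shift is assumed here, and the names `decode_addend9` and `blend_unit` are hypothetical.

```python
def decode_addend9(raw: int) -> int:
    """Interpret a 9-bit bus value as a signed number in -256..255."""
    raw &= 0x1FF
    return raw - 0x200 if raw & 0x100 else raw


def blend_unit(multp1: int, multp2: int, addend: int) -> int:
    """Model of (multp1 x multp2) + addend followed by the clamp.

    multp1: 9-bit value, 0x000..0x0FF is a fraction, bit 8 set means 1.00h.
    multp2: 8-bit data term, 0..255.
    addend: signed value, -256..255.
    """
    if multp1 & 0x100:
        # Bit 8 set or 1.00h selected: multp2 passes through (table footnote).
        product = multp2 & 0xFF
    else:
        # Assumption: the 8 x 8 product is truncated, not rounded.
        product = ((multp1 & 0xFF) * (multp2 & 0xFF)) >> 8
    intermediate = product + addend          # -256..510
    return max(0, min(255, intermediate))    # clamp to 0..255


print(blend_unit(0x80, 200, 10))
print(blend_unit(0x100, 200, 100))
print(blend_unit(0x40, 100, decode_addend9(0x1CE)))
```

the second call shows the upper clamp (200 + 100 saturates at 255) and the third shows the lower clamp with a negative addend (25 - 50 saturates at 0).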
the "alpha" values, denoted as as and ad, should be placed at the most significant byte of the respective bus, i.e. as at n [31:24], which is from palu_dq [31:24], and ad at o [31:24], which is from the pixel buffer.

table 3.5 lists possible multiplicand/addend selections for each opengl blending mode. note that the "preblend cycle" columns only apply to two-cycle blending operations. each table entry represents data for all four blending units, and some entries contain two terms. the first term applies to the blending units that are designated for blending color data (blend units 0, 1, and 2). the second term is for the blend unit operating on the alpha value (unit 3). the individual color terms have been grouped together in the table for simplicity. for example, the term "cd, ad" represents "rd, gd, bd, ad" and "1-cd, 1-ad" represents "1-rd, 1-gd, 1-bd, 1-ad."

note that terms such as "1-cd" and "1-ad" that are generated inside the 3d-ram are approximated by the 1's complement. for example, the term "1-ad" is actually ~ad, the bitwise inverse of ad. note also that in alpha_saturate blending, the multiplicand selections for color and alpha are different.

there are certainly more ways to do the blending operations than those listed in table 3.5. the list demonstrates that the 3d-ram does support all opengl blending modes.
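to make the use of the preblend and normal cycle columns concrete, the sketch below traces one color channel through the table 3.5 entry for sfactor = one_minus_dst_alpha and dfactor = src_alpha, as that entry is read here: the preblend cycle multiplies cs by 1-ad and loops the product back, and the normal cycle multiplies as by cd and adds the looped-back product. the 1's complement stands in for 1-ad as described above. the truncating 8-bit shift and all function names are assumptions made for this example.

```python
def inv8(value: int) -> int:
    """1's complement of a byte, used by the hardware in place of 1 - value."""
    return ~value & 0xFF


def frac_mul(multp1: int, multp2: int) -> int:
    """8 x 8 multiply with multp1 as a fraction (assumed truncating shift)."""
    return ((multp1 & 0xFF) * (multp2 & 0xFF)) >> 8


def clamp8(value: int) -> int:
    return max(0, min(255, value))


def two_cycle_blend(cs: int, a_s: int, cd: int, ad: int) -> int:
    """cs x (1-ad) + cd x as for one color channel, in two cycles."""
    # Preblend cycle: multp1 = cs (new data), multp2 = ~ad, product looped back.
    looped_product = frac_mul(cs, inv8(ad))
    # Normal cycle: multp1 = as, multp2 = cd, addend = loop back(mpy).
    return clamp8(frac_mul(a_s, cd) + looped_product)


print(two_cycle_blend(cs=0x80, a_s=0x40, cd=200, ad=55))
```

for any byte value, `inv8(v)` equals 255 - v, which is why the text calls ~ad an approximation of 1-ad rather than an exact value.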
table 3.5 multiplicand/addend selection for each opengl blending factor pair

legend: cs=rs,gs,bs; cd=rd,gd,bd; x=don't care; na=not applicable; f=min(as,1-ad); *=arithmetic multiplication; mpy=multiplier result; add=addend term; k=constant source register; dq=palu_dq pins or {palu_dx, palu_dq} pins; new=new in rev. 1.03. a factor pair that is listed twice has two possible selections.

columns: sfactor | dfactor | preblend multp1 | preblend multp2 | preblend addend | normal multp1 | normal multp2 | normal addend | note (where applicable)

0, 0 | 0, 0 | na | na | na | 0, 0 (from k) | cd, ad | 0, 0 (from dq)
1, 1 | 0, 0 | na | na | na | 0, 0 (from k) | cd, ad | cs, as
cd, ad | 0, 0 | na | na | na | cs, as | cd, ad | 0, 0 (from k)
1-cd, 1-ad | 0, 0 | na | na | na | cs, as | 1-cd, 1-ad | 0, 0 (from k)
as, as | 0, 0 | na | na | na | 0, 0 (from k) | cd, ad | cs*as, as*as
as, as | 0, 0 | na | na | na | cs, as | as, as | 0, 0 (from k)
1-as, 1-as | 0, 0 | na | na | na | 0, 0 (from k) | cd, ad | cs*(1-as), as*(1-as)
ad, ad | 0, 0 | na | na | na | cs, as | ad, ad | 0, 0 (from k)
1-ad, 1-ad | 0, 0 | na | na | na | cs, as | 1-ad, 1-ad | 0, 0 (from k)
f, 1 | 0, 0 | na | na | na | cs, 0 (from k) | f, f | 0 (from k), as | new
ck, ak | 0, 0 | na | na | na | 0, 0 (from k) | cd, ad | cs*ck, as*ak | new
1-ck, 1-ak | 0, 0 | na | na | na | 0, 0 (from k) | cd, ad | cs*(1-ck), as*(1-ak) | new
ak, ak | 0, 0 | na | na | na | 0, 0 (from k) | cd, ad | cs*ak, as*ak | new
1-ak, 1-ak | 0, 0 | na | na | na | 0, 0 (from k) | cd, ad | cs*(1-ak), as*(1-ak) | new

0, 0 | 1, 1 | na | na | na | 1, 1 | cd, ad | 0, 0 (from k)
1, 1 | 1, 1 | na | na | na | 1, 1 | cd, ad | cs, as
cd, ad | 1, 1 | na | na | na | cs, as | cd, ad | cd, ad
1-cd, 1-ad | 1, 1 | na | na | na | cs, as | 1-cd, 1-ad | cd, ad
as, as | 1, 1 | na | na | na | 1, 1 | cd, ad | cs*as, as*as
as, as | 1, 1 | na | na | na | cs, as | as, as | cd, ad
1-as, 1-as | 1, 1 | na | na | na | 1, 1 | cd, ad | cs*(1-as), as*(1-as)
ad, ad | 1, 1 | na | na | na | cs, as | ad, ad | cd, ad
1-ad, 1-ad | 1, 1 | na | na | na | cs, as | 1-ad, 1-ad | cd, ad
f, 1 | 1, 1 | na | na | na | cs, 1 | f, ad | cd, as
ck, ak | 1, 1 | na | na | na | 1, 1 | cd, ad | cs*ck, as*ak | new
1-ck, 1-ak | 1, 1 | na | na | na | 1, 1 | cd, ad | cs*(1-ck), as*(1-ak) | new
ak, ak | 1, 1 | na | na | na | 1, 1 | cd, ad | cs*ak, as*ak
1-ak, 1-ak | 1, 1 | na | na | na | 1, 1 | cd, ad | cs*(1-ak), as*(1-ak) | new

0, 0 | cs, as | na | na | na | cs, as | cd, ad | 0, 0 (from k) | new
1, 1 | cs, as | na | na | na | cs, as | cd, ad | cs, as
cd, ad | cs, as | cs, as | cd, ad | x | cs, as | cd, ad | loop back(mpy)
1-cd, 1-ad | cs, as | cs, as | 1-cd, 1-ad | x | cs, as | cd, ad | loop back(mpy)
as, as | cs, as | x | x | cs*as, as*as | cs, as | cd, ad | loop back(add)
as, as | cs, as | cs, as | as, as | x | cs, as | cd, ad | loop back(mpy)
1-as, 1-as | cs, as | x | x | cs*(1-as), as*(1-as) | cs, as | cd, ad | loop back(add)
ad, ad | cs, as | cs, as | ad, ad | x | cs, as | cd, ad | loop back(mpy)
1-ad, 1-ad | cs, as | cs, as | 1-ad, 1-ad | x | cs, as | cd, ad | loop back(mpy)
f, 1 | cs, as | cs, x | f, x | x, as | cs, as | cd, ad | loop back(mpy), loop back(add)
ck, ak | cs, as | x | x | cs*ck, as*ak | cs, as | cd, ad | loop back(add) | new
1-ck, 1-ak | cs, as | x | x | cs*(1-ck), as*(1-ak) | cs, as | cd, ad | loop back(add) | new
ak, ak | cs, as | x | x | cs*ak, as*ak | cs, as | cd, ad | loop back(add) | new
1-ak, 1-ak | cs, as | x | x | cs*(1-ak), as*(1-ak) | cs, as | cd, ad | loop back(add) | new

0, 0 | 1-cs, 1-as | na | na | na | 1-cs, 1-as | cd, ad | 0, 0 (from k)
1, 1 | 1-cs, 1-as | x | x | cs, as | 1-cs, 1-as | cd, ad | loop back(add)
cd, ad | 1-cs, 1-as | cs, as | cd, ad | x | 1-cs, 1-as | cd, ad | loop back(mpy)
1-cd, 1-ad | 1-cs, 1-as | cs, as | 1-cd, 1-ad | x | 1-cs, 1-as | cd, ad | loop back(mpy)
as, as | 1-cs, 1-as | x | x | cs*as, as*as | 1-cs, 1-as | cd, ad | loop back(add)
as, as | 1-cs, 1-as | cs, as | as, as | x | 1-cs, 1-as | cd, ad | loop back(mpy)
1-as, 1-as | 1-cs, 1-as | x | x | cs*(1-as), as*(1-as) | 1-cs, 1-as | cd, ad | loop back(add)
ad, ad | 1-cs, 1-as | cs, as | ad, ad | x | 1-cs, 1-as | cd, ad | loop back(mpy)
1-ad, 1-ad | 1-cs, 1-as | cs, as | 1-ad, 1-ad | x | 1-cs, 1-as | cd, ad | loop back(mpy)
f, 1 | 1-cs, 1-as | cs, x | f, x | x, as | 1-cs, 1-as | cd, ad | loop back(mpy), loop back(add)
ck, ak | 1-cs, 1-as | x | x | cs*ck, as*ak | 1-cs, 1-as | cd, ad | loop back(add) | new
1-ck, 1-ak | 1-cs, 1-as | x | x | cs*(1-ck), as*(1-ak) | 1-cs, 1-as | cd, ad | loop back(add) | new
ak, ak | 1-cs, 1-as | x | x | cs*ak, as*ak | 1-cs, 1-as | cd, ad | loop back(add) | new
1-ak, 1-ak | 1-cs, 1-as | x | x | cs*(1-ak), as*(1-ak) | 1-cs, 1-as | cd, ad | loop back(add) | new

0, 0 | as, as | na | na | na | as, as | cd, ad | 0, 0 (from k)
1, 1 | as, as | na | na | na | as, as | cd, ad | cs, as
cd, ad | as, as | cs, as | cd, ad | x | as, as | cd, ad | loop back(mpy)
1-cd, 1-ad | as, as | cs, as | 1-cd, 1-ad | x | as, as | cd, ad | loop back(mpy)
as, as | as, as | na | na | na | as, x | cd, x | cs*as, x
as, as | as, as | cs, as | as, as | x | as, as | cd, ad | loop back(mpy)
1-as, 1-as | as, as | na | na | na | as, x | cd, x | cs*(1-as), x
1-as, 1-as | as, as | x | x | cs*(1-as), as*(1-as) | as, as | cd, ad | loop back(add)
ad, ad | as, as | cs, as | ad, ad | x | as, as | cd, ad | loop back(mpy)
1-ad, 1-ad | as, as | cs, as | 1-ad, 1-ad | x | as, as | cd, ad | loop back(mpy)
f, 1 | as, as | cs, x | f, x | x, as | as, as | cd, ad | loop back(mpy), loop back(add)
ck, ak | as, as | x | x | cs*ck, as*ak | as, as | cd, ad | loop back(add) | new
1-ck, 1-ak | as, as | x | x | cs*(1-ck), as*(1-ak) | as, as | cd, ad | loop back(add) | new
ak, ak | as, as | x | x | cs*ak, as*ak | as, as | cd, ad | loop back(add) | new
1-ak, 1-ak | as, as | x | x | cs*(1-ak), as*(1-ak) | as, as | cd, ad | loop back(add) | new

0, 0 | 1-as, 1-as | na | na | na | 1-as, 1-as | cd, ad | 0, 0 (from k)
1, 1 | 1-as, 1-as | na | na | na | 1-as, x | cd, x | cs, x
1, 1 | 1-as, 1-as | x | x | cs, as | 1-as, 1-as | cd, ad | loop back(add)
cd, ad | 1-as, 1-as | cs, as | cd, ad | x | 1-as, 1-as | cd, ad | loop back(mpy)
1-cd, 1-ad | 1-as, 1-as | cs, as | 1-cd, 1-ad | x | 1-as, 1-as | cd, ad | loop back(mpy)
as, as | 1-as, 1-as | na | na | na | 1-as, x | cd, x | cs*as, x
as, as | 1-as, 1-as | cs, as | as, as | x | 1-as, x | cd, ad | loop back(mpy)
1-as, 1-as | 1-as, 1-as | na | na | na | 1-as, x | cd, x | cs*(1-as), x
1-as, 1-as | 1-as, 1-as | x | x | cs*(1-as), as*(1-as) | 1-as, 1-as | cd, ad | loop back(add)
ad, ad | 1-as, 1-as | cs, as | ad, ad | x | 1-as, 1-as | cd, ad | loop back(mpy)
1-ad, 1-ad | 1-as, 1-as | cs, as | 1-ad, 1-ad | x | 1-as, 1-as | cd, ad | loop back(mpy)
f, 1 | 1-as, 1-as | cs, x | f, x | x, as | 1-as, 1-as | cd, ad | loop back(mpy), loop back(add) | new
ck, ak | 1-as, 1-as | x | x | cs*ck, as*ak | 1-as, 1-as | cd, ad | loop back(add) | new
1-ck, 1-ak | 1-as, 1-as | x | x | cs*(1-ck), as*(1-ak) | 1-as, 1-as | cd, ad | loop back(add) | new
ak, ak | 1-as, 1-as | x | x | cs*ak, as*ak | 1-as, 1-as | cd, ad | loop back(add) | new
1-ak, 1-ak | 1-as, 1-as | x | x | cs*(1-ak), as*(1-ak) | 1-as, 1-as | cd, ad | loop back(add) | new

0, 0 | ad, ad | na | na | na | cd, ad | ad, ad | 0, 0 (from k) | new
1, 1 | ad, ad | na | na | na | cd, ad | ad, ad | cs, as | new
cd, ad | ad, ad | cs, as | cd, ad | x | cd, ad | ad, ad | loop back(mpy)
1-cd, 1-ad | ad, ad | cs, as | 1-cd, 1-ad | x | cd, ad | ad, ad | loop back(mpy)
as, as | ad, ad | na | na | na | cd, ad | ad, ad | cs*as, as*as
1-as, 1-as | ad, ad | na | na | na | cd, ad | ad, ad | cs*(1-as), as*(1-as)
ad, ad | ad, ad | cs, as | ad, ad | x | cd, ad | ad, ad | loop back(mpy)
1-ad, 1-ad | ad, ad | cs, as | 1-ad, 1-ad | x | cd, ad | ad, ad | loop back(mpy)
f, 1 | ad, ad | cs, x | f, x | x, as | cd, ad | ad, ad | loop back(mpy), loop back(add)
ck, ak | ad, ad | x | x | cs*ck, as*ak | cd, ad | ad, ad | loop back(add) | new
1-ck, 1-ak | ad, ad | x | x | cs*(1-ck), as*(1-ak) | cd, ad | ad, ad | loop back(add) | new
ak, ak | ad, ad | x | x | cs*ak, as*ak | cd, ad | ad, ad | loop back(add) | new
1-ak, 1-ak | ad, ad | x | x | cs*(1-ak), as*(1-ak) | cd, ad | ad, ad | loop back(add) | new

0, 0 | 1-ad, 1-ad | na | na | na | cd, ad | 1-ad, 1-ad | 0, 0 (from k)
1, 1 | 1-ad, 1-ad | na | na | na | cd, ad | 1-ad, 1-ad | cs, as
cd, ad | 1-ad, 1-ad | cs, as | cd, ad | x | cd, ad | 1-ad, 1-ad | loop back(mpy)
1-cd, 1-ad | 1-ad, 1-ad | cs, as | 1-cd, 1-ad | x | cd, ad | 1-ad, 1-ad | loop back(mpy)
as, as | 1-ad, 1-ad | na | na | na | cd, ad | 1-ad, 1-ad | cs*as, as*as
1-as, 1-as | 1-ad, 1-ad | na | na | na | cd, ad | 1-ad, 1-ad | cs*(1-as), as*(1-as)
ad, ad | 1-ad, 1-ad | cs, as | ad, ad | x | cd, ad | 1-ad, 1-ad | loop back(mpy)
1-ad, 1-ad | 1-ad, 1-ad | cs, as | 1-ad, 1-ad | x | cd, ad | 1-ad, 1-ad | loop back(mpy)
f, 1 | 1-ad, 1-ad | cs, x | f, x | x, as | cd, ad | 1-ad, 1-ad | loop back(mpy), loop back(add)
ck, ak | 1-ad, 1-ad | x | x | cs*ck, as*ak | cd, ad | 1-ad, 1-ad | loop back(add) | new
1-ck, 1-ak | 1-ad, 1-ad | x | x | cs*(1-ck), as*(1-ak) | cd, ad | 1-ad, 1-ad | loop back(add) | new
ak, ak | 1-ad, 1-ad | x | x | cs*ak, as*ak | cd, ad | 1-ad, 1-ad | loop back(add) | new
1-ak, 1-ak | 1-ad, 1-ad | x | x | cs*(1-ak), as*(1-ak) | cd, ad | 1-ad, 1-ad | loop back(add) | new

0, 0 | ck, ak | na | na | na | ck, ak | cd, ad | 0, 0 (from k) | new
1, 1 | ck, ak | x | x | cs, as | ck, ak | cd, ad | loop back(add) | new
cd, ad | ck, ak | cs, as | cd, ad | x | ck, ak | cd, ad | loop back(mpy) | new
1-cd, 1-ad | ck, ak | cs, as | 1-cd, 1-ad | x | ck, ak | cd, ad | loop back(mpy) | new
as, as | ck, ak | x | x | cs*as, as*as | ck, ak | cd, ad | loop back(add) | new
1-as, 1-as | ck, ak | x | x | cs*(1-as), as*(1-as) | ck, ak | cd, ad | loop back(add) | new
ad, ad | ck, ak | cs, as | ad, ad | x | ck, ak | cd, ad | loop back(mpy) | new
1-ad, 1-ad | ck, ak | cs, as | 1-ad, 1-ad | x | ck, ak | cd, ad | loop back(mpy) | new
f, 1 | ck, ak | cs, x | f, x | x, as | ck, ak | cd, ad | loop back(mpy), loop back(add) | new
ck, ak | ck, ak | x | x | cs*ck, as*ak | ck, ak | cd, ad | loop back(add) | new
1-ck, 1-ak | ck, ak | x | x | cs*(1-ck), as*(1-ak) | ck, ak | cd, ad | loop back(add) | new
ak, ak | ck, ak | x | x | cs*ak, as*ak | ck, ak | cd, ad | loop back(add) | new
1-ak, 1-ak | ck, ak | x | x | cs*(1-ak), as*(1-ak) | ck, ak | cd, ad | loop back(add) | new

0, 0 | 1-ck, 1-ak | na | na | na | 1-ck, 1-ak | cd, ad | 0, 0 (from k) | new
1, 1 | 1-ck, 1-ak | x | x | cs, as | 1-ck, 1-ak | cd, ad | loop back(add) | new
cd, ad | 1-ck, 1-ak | cs, as | cd, ad | x | 1-ck, 1-ak | cd, ad | loop back(mpy) | new
1-cd, 1-ad | 1-ck, 1-ak | cs, as | 1-cd, 1-ad | x | 1-ck, 1-ak | cd, ad | loop back(mpy) | new
as, as | 1-ck, 1-ak | x | x | cs*as, as*as | 1-ck, 1-ak | cd, ad | loop back(add) | new
1-as, 1-as | 1-ck, 1-ak | x | x | cs*(1-as), as*(1-as) | 1-ck, 1-ak | cd, ad | loop back(add) | new
ad, ad | 1-ck, 1-ak | cs, as | ad, ad | x | 1-ck, 1-ak | cd, ad | loop back(mpy) | new
1-ad, 1-ad | 1-ck, 1-ak | cs, as | 1-ad, 1-ad | x | 1-ck, 1-ak | cd, ad | loop back(mpy) | new
f, 1 | 1-ck, 1-ak | cs, x | f, x | x, as | 1-ck, 1-ak | cd, ad | loop back(mpy), loop back(add) | new
ck, ak | 1-ck, 1-ak | x | x | cs*ck, as*ak | 1-ck, 1-ak | cd, ad | loop back(add) | new
1-ck, 1-ak | 1-ck, 1-ak | x | x | cs*(1-ck), as*(1-ak) | 1-ck, 1-ak | cd, ad | loop back(add) | new
ak, ak | 1-ck, 1-ak | x | x | cs*ak, as*ak | 1-ck, 1-ak | cd, ad | loop back(add) | new
1-ak, 1-ak | 1-ck, 1-ak | x | x | cs*(1-ak), as*(1-ak) | 1-ck, 1-ak | cd, ad | loop back(add) | new

0, 0 | ak, ak | na | na | na | ak, ak | cd, ad | 0, 0 (from k) | new
1, 1 | ak, ak | x | x | cs, as | ak, ak | cd, ad | loop back(add) | new
cd, ad | ak, ak | cs, as | cd, ad | x | ak, ak | cd, ad | loop back(mpy) | new
1-cd, 1-ad | ak, ak | cs, as | 1-cd, 1-ad | x | ak, ak | cd, ad | loop back(mpy) | new
as, as | ak, ak | x | x | cs*as, as*as | ak, ak | cd, ad | loop back(add) | new
1-as, 1-as | ak, ak | x | x | cs*(1-as), as*(1-as) | ak, ak | cd, ad | loop back(add) | new
ad, ad | ak, ak | cs, as | ad, ad | x | ak, ak | cd, ad | loop back(mpy) | new
1-ad, 1-ad | ak, ak | cs, as | 1-ad, 1-ad | x | ak, ak | cd, ad | loop back(mpy) | new
f, 1 | ak, ak | cs, 1 | f, as | x | ak, ak | cd, ad | loop back(mpy) | new
ck, ak | ak, ak | x | x | cs*ck, as*ak | ak, ak | cd, ad | loop back(add) | new
1-ck, 1-ak | ak, ak | x | x | cs*(1-ck), as*(1-ak) | ak, ak | cd, ad | loop back(add) | new
ak, ak | ak, ak | x | x | cs*ak, as*ak | ak, ak | cd, ad | loop back(add) | new
1-ak, 1-ak | ak, ak | x | x | cs*(1-ak), as*(1-ak) | ak, ak | cd, ad | loop back(add) | new

0, 0 | 1-ak, 1-ak | na | na | na | 1-ak, 1-ak | cd, ad | 0, 0 (from k) | new
1, 1 | 1-ak, 1-ak | x | x | cs, as | 1-ak, 1-ak | cd, ad | loop back(add) | new
cd, ad | 1-ak, 1-ak | cs, as | cd, ad | x | 1-ak, 1-ak | cd, ad | loop back(mpy) | new
1-cd, 1-ad | 1-ak, 1-ak | cs, as | 1-cd, 1-ad | x | 1-ak, 1-ak | cd, ad | loop back(mpy) | new
as, as | 1-ak, 1-ak | x | x | cs*as, as*as | 1-ak, 1-ak | cd, ad | loop back(add) | new
1-as, 1-as | 1-ak, 1-ak | x | x | cs*(1-as), as*(1-as) | 1-ak, 1-ak | cd, ad | loop back(add) | new
stencil modes (new)

the stencil buffer in a 3d graphics system may be used to restrict drawing to a certain portion of the screen, just as a cardboard stencil may be used with a can of spray paint to make precise, painted images. 3d-ram offers a broad range of support for on-chip stencil hardware acceleration.

there are two distinct stencil modes supported inside the 3d-ram. the opengl stencil mode is fully compliant with the opengl specification, which allows for any number of stencil planes from 0 through 8. the 3d-ram also offers the decal stencil mode, which is compatible with the previous generation of the 3d-ram, m5m410092a. these two stencil modes should not be used at the same time. if the opengl stencil mode is being used, the decal stencil mode should be disabled. similarly, the opengl stencil mode should be disabled when using the decal stencil mode. however, the 3d-ram chip itself does not check for or inhibit such a conflict, and it is the controller's responsibility to ensure that it does not occur.

opengl stencil mode operations (new)

this stencil mode provides fully compliant opengl stencil operations. the 3d-ram stencil features are implemented in rop/blend unit 3, so that bits [31:24] of the 32-bit alu unit are available as stencil planes. these 8 bits are bitwise enabled, so that any number of stencil planes from 0 through 8 may be used.

there are two parts in the data flow of this stencil mode, as illustrated in figure 3.8, and the paragraphs that follow refer to the blocks in this figure. the first part involves a stencil function which compares the old stencil data to the reference data stf.ref, which may be either the most significant byte of the data at the palu_dq pins or the most significant byte in the constant source register.
the second part involves the execution of a certain stencil operation on the old stencil data or the stf.ref data, based on the results of the stencil and magnitude comparison functions.

table 3.5 multiplicand/addend selection for each opengl blending factor pair (continued)

legend: cs=rs,gs,bs; cd=rd,gd,bd; x=don't care; na=not applicable; f=min(as,1-ad); *=arithmetic multiplication; mpy=multiplier result; add=addend term; k=constant source register; dq=palu_dq pins or {palu_dx, palu_dq} pins; new=new in rev. 1.03.

columns: sfactor | dfactor | preblend multp1 | preblend multp2 | preblend addend | normal multp1 | normal multp2 | normal addend | note (where applicable)

ad, ad | 1-ak, 1-ak | cs, as | ad, ad | x | 1-ak, 1-ak | cd, ad | loop back(mpy) | new
1-ad, 1-ad | 1-ak, 1-ak | cs, as | 1-ad, 1-ad | x | 1-ak, 1-ak | cd, ad | loop back(mpy) | new
f, 1 | 1-ak, 1-ak | cs, x | f, x | x, as | 1-ak, 1-ak | cd, ad | loop back(mpy), loop back(add) | new
ck, ak | 1-ak, 1-ak | x | x | cs*ck, as*ak | 1-ak, 1-ak | cd, ad | loop back(add) | new
1-ck, 1-ak | 1-ak, 1-ak | x | x | cs*(1-ck), as*(1-ak) | 1-ak, 1-ak | cd, ad | loop back(add) | new
ak, ak | 1-ak, 1-ak | x | x | cs*ak, as*ak | 1-ak, 1-ak | cd, ad | loop back(add) | new
1-ak, 1-ak | 1-ak, 1-ak | x | x | cs*(1-ak), as*(1-ak) | 1-ak, 1-ak | cd, ad | loop back(add) | new
figure 3.8 operations of opengl stencil mode

the stencil function block compares the magnitude of the stencil reference data, stf.ref, to the old stencil data that is read from the pixel buffer. a mask for the stencil data makes it possible to ignore certain bits in the comparison. the exact type of comparison executed in the stencil function block is defined by stf.func [2:0]. the options for this comparison are gl_always, gl_greater, gl_equal, gl_gequal, gl_never, gl_lequal, gl_notequal, and gl_less, as defined in table 3.29.

the stencil operation block considers the results from the stencil function and the magnitude compare, and alters the stencil data stored in the pixel buffer according to the settings of the stencil control register. the stencil operation block can be set to zero, keep, invert, replace, increment, or decrement the stencil planes. the actions taken by the stencil operation block for three different cases are defined in table 3.26 through table 3.28.

separately, the sram write enable logic block considers the pass_in pins, the match compare result, the magnitude compare result, and the stencil function result. if either of the pass_in [1:0] pins is 0 or the match compare result is 0, the pixel buffer will not be updated and the pass_out pin will be 0. if both pass_in [1:0] pins are 1 and the match compare passes, then the sram write enable logic block uses the results of the stencil function and the magnitude compare to determine whether to write to the pixel buffer and what the state of the pass_out pin should be.

note that the states of the two pass_in pins and the match compare result override the results of the stencil function and the depth test. this overriding condition is what makes the stencil support useful in the overall graphics processing.
for example, it may be desired to restrict stencil functions and stencil operations to a certain window on the display monitor, based on a comparison of window id bits, which may or may not be stored on the same 3d-ram chips as the z buffer and the stencil planes.

(figure 3.8 blocks: stencil function, magnitude compare, match compare, stencil operation, rop unit 3, sram write enable logic, pixel buffer. signals: palu_dq [31:24], k [31:24], old [31:24], stc [19], stf.ref, stf.mask [7:0], stf.func [2:0], st.enable [7:0], stop.fail [2:0], stop.zfail [2:0], stop.zpass [2:0], pass_in [1:0], pass_out, write enable byte 3, write enable bytes 0,1,2.)

the pseudo code in list 3.1 summarizes the above explanations in a concise format. note that the expression "st.test(stf)" refers to an opengl stencil test based on the stencil function selected in the glstencil_func. the glstencil_func should set and reset the bits in the 3d-ram stencil planes register and stencil control register.

note also that the pseudo code assumes that the stencil operations stop.fail, stop.zpass, and stop.zfail take st.enable [7:0] into account for proper execution of the gl_decr and gl_incr operations, with the correct bit alignment and with underflow and overflow clamping. the details of the calculations are not explicitly shown. see also page 66 for the paragraph describing the bit field st.enable.
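the bit alignment and clamping mentioned above can be sketched as follows. this is only an illustration in python, not the device's documented algorithm: it assumes that the enabled bits form one contiguous field (as the restrictions on this mode require) and that the counter clamps at 0 and at the field maximum. the names field_bounds and stencil_step are hypothetical.

```python
def field_bounds(enable):
    """Return (shift, width) of the contiguous run of 1 bits in an 8-bit mask."""
    if enable == 0:
        return 0, 0
    shift = (enable & -enable).bit_length() - 1   # index of the lowest set bit
    width = (enable >> shift).bit_length()
    if (enable >> shift) != (1 << width) - 1:
        raise ValueError("enabled stencil bits must be contiguous")
    return shift, width


def stencil_step(old, enable, delta):
    """Apply delta (+1 for increment, -1 for decrement) to the enabled field.

    The field value is clamped to [0, field maximum]; bits that are not
    enabled keep their old value. This clamping rule is an assumption.
    """
    shift, width = field_bounds(enable)
    if width == 0:
        return old & 0xFF
    top = (1 << width) - 1
    value = (old >> shift) & top
    value = min(top, max(0, value + delta))
    return (old & ~enable & 0xFF) | (value << shift)
```

with st.enable = 00111100b the stencil counter occupies bits 5 through 2, so an increment adds 4 to the raw byte until the field reaches 1111b, and bits 7, 6, 1, and 0 pass through untouched.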
    if (stc[19]==0) stf.ref=palu_dq[31:24]
    else            stf.ref=k[31:24]

    if ( ((!pins[0]||pass_in[0]) && (!pins[8]||pass_in[1]) && match_comp)==0 )
    {
        pixel_buffer_write_enable_byte[3:0]=0000b
        pass_out=0
    }
    else /* match test passes and both pass_in pins true */
    {
        if ( st.test(stf)==0 )
        {
            pixel_buffer_write_enable_byte[3:0]=1000b
            for bits disabled by st.enable[7:0]
                pixel_buffer_data_in[31:24]=old[31:24]
            for bits enabled by st.enable[7:0]
                pixel_buffer_data_in[31:24]=stop.fail(stf.ref, old[31:24])
            pass_out=0
        }
        else /* stencil test passes */
        {
            if ( magnitude_comp==1 )
            {
                pixel_buffer_write_enable_byte[3:0]=1111b
                for bits disabled by st.enable[7:0]
                    pixel_buffer_data_in[31:24]=rop(old[31:24])
                for bits enabled by st.enable[7:0]
                    pixel_buffer_data_in[31:24]=stop.zpass(stf.ref, old[31:24])
                pass_out=1
            }
            else /* depth test fails */
            {
                pixel_buffer_write_enable_byte[3:0]=1000b
                for bits disabled by st.enable[7:0]
                    pixel_buffer_data_in[31:24]=old[31:24]
                for bits enabled by st.enable[7:0]
                    pixel_buffer_data_in[31:24]=stop.zfail(stf.ref, old[31:24])
                pass_out=0
            } /* depth test */
        } /* stencil test */
    } /* pass_in/match test */

list 3.1 pseudo code for opengl stencil operations

restrictions on opengl stencil mode

several restrictions must be met for the opengl stencil mode to function correctly:

- for increment and decrement operations to perform correctly, the enabled stencil bits must be in one contiguous group.
- the z planes must be stored in the same 3d-ram chip as the stencil planes.
- the magnitude compare unit must be used for the depth test.
- if any stencil bits are enabled, the rop/blend unit 3 cannot be used for blending, i.e. rbc [27] must be ??

the source for the stencil reference value (stf.ref) is selected by the stencil control register bit 19 (stc [19]).
when stc [19] is 0, palu_dq [31:24] is stf.ref, and when stc [19] is 1, bits 31 through 24 of the constant source register (k [31:24]) are stf.ref. stf.ref is used in both the stencil test and the stencil operations at the same time.

decal stencil mode operations

the decal stencil mode is offered to provide compatibility with the stencil mode of the previous generation of the 3d-ram, m5m410092a. by setting the match mask register and the magnitude mask register, the user has flexible plane depths for the stencil buffer and the z buffer, respectively. the match compare unit, as in the normal, non-stencil mode, supports the never, always, equal, and notequal stencil test functions, while the byte-wide rop units support only the logical stencil update operations, namely keep, zero, replace, and invert, plus the additional one operation; the arithmetic stencil update operations increment and decrement are not implemented in the decal/invert stencil mode.
the decal stencil mode may be selected by setting the compare control register bit 10 to 1. in this mode, the internal pixel buffer write enable for a stateful write is no longer controlled solely by pass_in [1:0] and pass_out. the added condition in the write enable signal generation is the output of the match compare unit, which now overrides pass_out: the inverted output of the match compare unit is logically anded with pass_in [1:0] to set the pixel buffer write enable signal (see table 3.6).

when the stencil buffer and the z buffer are placed on the same 3d-ram chip, the match compare unit performs the stencil match test and the magnitude compare unit performs the z (depth) compare test. when both tests pass, pass_out can enable the update of the color buffer in other 3d-ram chips through their pass_in pins. in the meantime, the stencil and z buffers are also updated internally. if the stencil match test fails, then only the stencil and z buffers may be updated internally, and the color buffer on other 3d-ram chips is left unchanged.

invert stencil mode operations

warning! the invert stencil mode is removed from this device M5M410092B, and an incompatibility exists between this device M5M410092B and its previous generation, m5m410092a, with respect to this function.

16-bit color mode

there are two popular 16-bit color modes: the (4, 4, 4, 4) mode with 4 bits each for alpha, red, green, and blue; and the (5, 6, 5, 0) mode with 5-bit red, 6-bit green, and 5-bit blue. the 16-bit on-chip blending functions are available only for the (4, 4, 4, 4) mode. if no alu operation is invoked, the 3d-ram can be used to store (5, 6, 5, 0) 16-bit data.

in the normal (8, 8, 8, 8) color mode, alpha must be placed in the most significant byte palu_dq [31:24] because of the special circuits that handle the alpha-saturate case in blend unit 3.
red, green, and blue color data may be placed on the other three bytes palu_dq [23:0] in any permutation. the four rop/blend units operate on the four color components and write the results back to the pixel buffer.

in the (4, 4, 4, 4) 16-bit mode, the 32-bit bus within the 3d-ram is used for double buffering the 16-bit (4, 4, 4, 4) data. for example, buffer a data is placed on the upper nibble of each byte: palu_dq [31:28] for alpha, and palu_dq [23:20], palu_dq [15:12], and palu_dq [7:4] for red, green, and blue. buffer b data is placed on the lower nibble of each byte: palu_dq [27:24] for alpha, and palu_dq [19:16], palu_dq [11:8], and palu_dq [3:0] for red, green, and blue. note again that the alpha data of both buffers a and b can only be stored in the most significant byte.

byte enable and nibble control

let nbln represent the nibble data on the palu_dq [4n+3:4n] pins, for n = 0 through 7. in the (8, 8, 8, 8) mode, palu_be [3] enables nbl7 and nbl6; palu_be [2] enables nbl5 and nbl4; palu_be [1] enables nbl3 and nbl2; palu_be [0] enables nbl1 and nbl0. an example of this arrangement is illustrated in figure 3.9.

table 3.6 stateful write enable in decal stencil mode

stencil mode? | stateful write enable | pass_out
no | pass_in [1:0] && pass_out | magpass && matchpass
yes | pass_in [1:0] && (pass_out + ~matchpass) | magpass && matchpass
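table 3.6 can be read as a small piece of boolean logic. the python sketch below is an interpretation, not a statement of the actual circuit: it assumes that "+" in the table means logical or, that "pass_in [1:0]" means both pass_in pins are true, and it ignores the pass_ins select register. the function name is hypothetical.

```python
def stateful_write_enable(pass_in, mag_pass, match_pass, decal_mode):
    """Return (write_enable, pass_out) following one reading of table 3.6.

    pass_in is a pair of booleans for pass_in[1] and pass_in[0].
    """
    pass_out = mag_pass and match_pass
    both_in = all(pass_in)
    if decal_mode:
        # A failed stencil match still lets the chip update itself.
        write_enable = both_in and (pass_out or not match_pass)
    else:
        write_enable = both_in and pass_out
    return write_enable, pass_out
```

under this reading, a failed stencil match in decal mode leaves pass_out low (so color buffers on other chips are not written) while the local write enable is still asserted, which matches the description of the stencil and z buffers being updated internally.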
figure 3.9 data mapping in the (8, 8, 8, 8) color mode

figure 3.10 data mapping in the (4, 4, 4, 4) color mode

in the (4, 4, 4, 4) mode, palu_be [3] enables nbl7 and nbl5; palu_be [2] enables nbl3 and nbl1; palu_be [1] enables nbl6 and nbl4; palu_be [0] enables nbl2 and nbl0. from the rendering controller output, palu_be [3,2] controls (a, r, g, b) in buffer a, while palu_be [1,0] controls (a, r, g, b) in buffer b. this byte enable assignment allows the (4, 4, 4, 4) and (5, 6, 5, 0) data formats to be supported with identical palu_be controls from the controller. figure 3.10 illustrates how data is mapped in the (4, 4, 4, 4) mode.
the solid black arrows correspond to the data flow of buffer a, and the gray arrows correspond to the data flow of buffer b. during write operations, the 4-bit data may be duplicated or padded with zeros in the lower nibble when processed by the byte-wide pixel alu; after the pixel alu processing, the lower nibble of the resulting 8-bit data is truncated to form 4-bit data before it is written into the pixel buffer.

to store 16-bit color in the (5, 6, 5, 0) data format, the 3d-ram is programmed to operate in (8, 8, 8, 8) mode. the (r, g, b) data for buffer a is stored in nbl7 to nbl4, which are controlled by palu_be [3:2]. the (r, g, b) data for buffer b is stored in nbl3 to nbl0, which are controlled by palu_be [1:0]. the data mapping for the (5, 6, 5, 0) color mode is shown in figure 3.11.

in summary, when the controller is in 16-bit color mode, either (4, 4, 4, 4) or (5, 6, 5, 0), asserting palu_be [3:2] controls buffer a read and write operations, and asserting palu_be [1:0] controls buffer b read and write operations. because the same blending circuits for (4, 4, 4, 4) mode are used for both color buffers, buffer a and buffer b stateful writes cannot occur on the same clock cycle.

figure 3.11 data mapping in the (5, 6, 5, 0) color mode

table 3.7 byte enable controls and color data placement in (4,4,4,4) mode

palu_dq | [31:28] | [27:24] | [23:20] | [19:16] | [15:12] | [11:8] | [7:4] | [3:0]
nbl | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
palu_be | 3 | 1 | 3 | 1 | 2 | 0 | 2 | 0
color data | alpha a | alpha b | red a | red b | green a | green b | blue a | blue b
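the nibble assignments for the two modes can be written down as lookup tables. the python sketch below is an illustration only: the two dictionaries transcribe the nibble assignments stated in the text, while the function names and the idea of expressing the result as a 32-bit mask over palu_dq are assumptions made for the example.

```python
# palu_be bit -> nibbles it enables, as stated in the text for each mode.
NIBBLES_8888 = {3: (7, 6), 2: (5, 4), 1: (3, 2), 0: (1, 0)}
NIBBLES_4444 = {3: (7, 5), 2: (3, 1), 1: (6, 4), 0: (2, 0)}


def nibble_write_mask(palu_be, mode_4444):
    """Expand a 4-bit palu_be value into a 32-bit mask of enabled palu_dq bits."""
    table = NIBBLES_4444 if mode_4444 else NIBBLES_8888
    mask = 0
    for be_bit, nibbles in table.items():
        if (palu_be >> be_bit) & 1:
            for n in nibbles:
                mask |= 0xF << (4 * n)
    return mask


def stateful_be_is_legal_4444(palu_be):
    """True if palu_be selects buffer a or buffer b, but not both."""
    selects_a = bool(palu_be & 0b1100)
    selects_b = bool(palu_be & 0b0011)
    return not (selects_a and selects_b)
```

for example, palu_be = 1100b in (4, 4, 4, 4) mode enables the upper nibble of every byte (buffer a), whereas the same value in (8, 8, 8, 8) mode enables the upper two bytes.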
the following is how palu_be [3:0] can be used in (4, 4, 4, 4) mode:

- in read data and stateless initial/normal write operations, palu_be [3:0] may be any combination (decoded as in table 3.7).
- in stateful initial/normal write and initiate two-cycle blending operations, palu_be [3:0] controls either buffer a data (as 1100b) or buffer b data (as 0011b), but not both (i.e. not 1111b).

  warning! during a stateful write or initiate two-cycle blending operation, palu_be [3:0] should be set to enable either buffer a or buffer b, but not both. enabling both buffers would result in buffer a and buffer b data using the same blending circuits at the same time, and therefore in undefined resulting data. in other words, in these operations palu_be [3] and palu_be [1] cannot be enabled at the same time. similarly, palu_be [2] and palu_be [0] cannot be enabled at the same time.

- palu_be mapping to the dirty tag in stateful/stateless write operations: either palu_be [0] or palu_be [2] sets the dirty_tag bits corresponding to both byte 0 and byte 1 of the addressed data word; either palu_be [1] or palu_be [3] sets the dirty_tag bits corresponding to both byte 2 and byte 3 of the addressed data word.
- write control register(s): same decoding as in (8, 8, 8, 8) mode.
- read id register: same decoding as in (8, 8, 8, 8) mode.
- (4, 4, 4, 4) mode has no effect on the use of the dirty tag (masked/unmasked write block from the pixel buffer to a dram bank); it only affects the writing of the dirty tag during stateful/stateless initial/normal write operations.

use of dirty tags in (4, 4, 4, 4) 16-bit mode

the mapping for the "replace dirty tag" and "or dirty tag" operations in (4, 4, 4, 4) mode is the same as in (8, 8, 8, 8) mode, except that the palu_be control decoding is that of the (4, 4, 4, 4) mode. the dirty tag functions in (4, 4, 4, 4) mode are summarized in table 3.9.
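the mapping of the palu_be pins to the dirty tags is illustrated in figure 3.12.

table 3.8 byte enable controls and color data placement in (5, 6, 5, 0) mode

palu_dq | [31:28] | [27:24] | [23:20] | [19:16] | [15:12] | [11:8] | [7:4] | [3:0]
nbl | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
palu_be | 3 (nbl7, nbl6) | 2 (nbl5, nbl4) | 1 (nbl3, nbl2) | 0 (nbl1, nbl0)
color data | red a, green a, blue a (nbl7 to nbl4) | red b, green b, blue b (nbl3 to nbl0)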
table 3.9 pixel alu operations involving dirty tags in (4, 4, 4, 4) mode

pixel operation | pixel data | new dirty tag contents
(stateful/stateless) normal data write | write to buffer a from palu_dq pins (per palu_be [3:2] pins) | the dirty tag bits for bytes 3 and 2 are ored with palu_be [3]; the dirty tag bits for bytes 1 and 0 are ored with palu_be [2]; the other 28 dirty tag bits are unchanged.
(stateful/stateless) normal data write | write to buffer b from palu_dq pins (per palu_be [1:0] pins) | the dirty tag bits for bytes 3 and 2 are ored with palu_be [1]; the dirty tag bits for bytes 1 and 0 are ored with palu_be [0]; the other 28 dirty tag bits are unchanged.
(stateful/stateless) initial data write | write to buffer a from palu_dq pins (per palu_be [3:2] pins) | palu_be [3] is written to the dirty tag bits for bytes 3 and 2; palu_be [2] is written to the dirty tag bits for bytes 1 and 0; 0 is written to the 28 unaddressed dirty tag bits.
(stateful/stateless) initial data write | write to buffer b from palu_dq pins (per palu_be [1:0] pins) | palu_be [1] is written to the dirty tag bits for bytes 3 and 2; palu_be [0] is written to the dirty tag bits for bytes 1 and 0; 0 is written to the 28 unaddressed dirty tag bits.
replace dirty tag | unchanged | palu_dq [31:0] replaces all 32 dirty tag bits.
or dirty tag | unchanged | all 32 dirty tag bits are ored with palu_dq [31:0].
figure 3.12 palu_be mapping to dirty tags for (4,4,4,4) mode (a 256-bit data block and its 32-bit dirty tag; nbl0 through nbl7 of each word are enabled by palu_be [0] through palu_be [3])
4-bit to 8-bit expansion for pixel alu blending

the pixel alu blending function operates on 8-bit components. to implement the 8-bit blending operation on a 4-bit color component, it is necessary to expand the 4-bit data (either from the palu_dq pins or from the pixel buffer) to an 8-bit operand. for the addend term, the 4-bit component is mapped to the upper nibble and zeros are padded into the lower nibble. for the multiplicand terms, simply padding zeros in the lower nibble would result in an incorrect pixel value, because of the computational error of the short bit-length representation. duplicating the 4-bit data in both the upper and lower nibbles avoids corrupting the pixel value. this effect is illustrated by the two examples in table 3.10 and table 3.11, in which the blending factor alpha has its maximum 4-bit value, fh. as table 3.11 shows, duplicating the upper nibble into the lower nibble allows the color data to blend with this alpha value without invoking an extra bit in the circuit. after the blending has occurred, the upper nibble is mapped back to the correct nibble in the pixel buffer. this mapping is illustrated in figure 3.10.

on the other hand, in the (4, 4, 4, 4) mode, the 4-bit addend, whether supplied from the palu_dq pins in both the preblend cycle and the normal cycle or looped back from the multiplier or the addend in the preblend cycle, will always be padded with 0000b in the least significant bits to minimize the error due to incorrect round-up.

table 3.10 blending color value and alpha, with alpha = fh, zeros padded at lower nibble

color value | blending (color x alpha) | arithmetic result | multiplier output
e | e0 x f0 | d200 | d
2 | 20 x f0 | 1e00 | 1

table 3.11 blending color value and alpha, with alpha = fh, duplication at lower nibble

color value | blending (color x alpha) | arithmetic result | multiplier output
e | ee x ff | ed12 | e
2 | 22 x ff | 21de | 2
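the arithmetic in table 3.10 and table 3.11 can be reproduced with a few lines of python. this is a sketch of the arithmetic only: it assumes that the multiplier output is the top nibble of the 16-bit product, which matches the table entries but is not spelled out as the hardware rounding rule. the function names are hypothetical.

```python
def expand_pad(nibble):
    """4-bit value in the upper nibble, zeros in the lower nibble."""
    return (nibble & 0xF) << 4


def expand_dup(nibble):
    """4-bit value duplicated into both nibbles."""
    nibble &= 0xF
    return (nibble << 4) | nibble


def blend_4bit(color, alpha, expand):
    """Multiply two expanded 4-bit operands and keep the top nibble."""
    product = expand(color) * expand(alpha)   # 8-bit x 8-bit gives 16 bits
    return product >> 12
```

with zero padding, a color of eh blended with full-scale alpha comes back as dh, so the pixel darkens even though alpha is at its maximum; with duplication the same blend returns eh unchanged.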
dual compare unit

a functional block diagram of the dual compare units is shown in figure 3.13. both match compare and magnitude compare are performed in parallel. the match mask and magnitude mask define which bits of the 32-bit word will be compared and which will be "don't care." one of the sources is always the old data (o) from the pixel buffer. the other source is independently selectable between the new data (n) from the palu_dq pins and the register data (k) from the constant register. the results of both match compare and magnitude compare operations are logically anded together to generate the pass_out pin. the external pass_in [1:0] and internal pass_out are then logically anded together to generate the write enable signal to the pixel buffer.

(new) in the normal mode (all stencil planes are disabled, stencil function = always pass), the results of the match compare and magnitude compare operations are logically anded to generate the pass_out signal. the external pass_in [1:0] and internal pass_out are then logically anded together to generate the write enable signal to the pixel buffer. if the opengl stencil mode is enabled, the pass_out and write enable generation are affected by the stencil function result. note that the decal stencil mode logic is not shown in figure 3.13. see the section on "stencil modes" on page 40 for more details on how stencil affects pass_out and the pixel buffer write enable.

figure 3.13 block diagram of the dual compare unit (pipeline stages are not shown) (new rev.1.03)

(figure 3.13 inputs: n [31:0], o [31:0], k [31:0], load match mask, load magnitude mask, load compare control, stencil function result, stencil enable. outputs: pass_out, write enable for bytes 0, 1, 2 and write enable for byte 3 to the pixel buffer. the pass_in term entering the write enable logic is labeled (!pins [0] || pass_in [1]) && (!pins [8] || pass_in [0]).)
pipelining

the 3d-ram pixel alu pipeline is designed so that the rendering controller can issue read and write operations with minimal cycle time. this is achieved by having all operations conform to a uniform 7-stage pipeline.

a sequence of pixel alu write operations may be issued consecutively, one write operation per cycle. the write opcode and address are sampled by the rising edge of mclk in one cycle, and the write data is loaded by the rising edge of the next mclk. to the rendering controller, n consecutive writes take only n+1 cycles. register writes do not affect operations issued in previous cycles; register writes always affect operations issued in subsequent cycles.

read operations may be issued consecutively, one read operation per two cycles. specifically, all read operations require that the same address be stable for at least two rising edges of mclk plus the set-up time. (see figure 8.4 for details.) because of the worst-case pin output delay, the read data may be available for sampling by the rendering controller in the second cycle after the read instruction. in other words, to the rendering controller, n consecutive read operations take 2(n+1) cycles plus access time.

a read operation may be issued immediately after a write operation without any delay cycles. an idle cycle will be automatically generated by the 3d-ram chip on the palu_dq pins between a write operation and the subsequent read operation.

at least two nop cycles must be inserted between a read operation and a subsequent write operation. the two nops are needed to guarantee one idle cycle on the palu_dq pins between the read data and the subsequent write data.

figure 3.14 illustrates the above statements on the pipeline flow of pixel alu read/write operations.
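the read-to-write turnaround rule above can be sketched as a small scheduling helper for a controller model. this python fragment is a simplification under stated assumptions: it only inserts the two nops required between a read and a following write, and it does not model the two-cycle address hold of reads or any timing parameter. the function name and the 'R'/'W' encoding are hypothetical.

```python
def schedule(ops):
    """Return the issue sequence for ops ('R' or 'W'), adding required nops.

    Only the rule "two nops between a read and a subsequent write" is
    applied; a read directly after a write needs no delay cycle.
    """
    issued = []
    previous = None
    for op in ops:
        if op not in ("R", "W"):
            raise ValueError("ops must contain only 'R' or 'W'")
        if previous == "R" and op == "W":
            issued.extend(["NOP", "NOP"])
        issued.append(op)
        previous = op
    return issued
```

a controller that batches reads together and writes together therefore pays the two-nop penalty only once per read-to-write transition.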
table 3.12 pixel alu and pixel buffer operation pipeline

stage | external activities | internal activities
1 | operation specified on palu_en, palu_we, palu_op, palu_a, and palu_be pins |
2 | write data on palu_dq and palu_dx pins if write operation | read pixel buffer. decode operation.
3 | read data on palu_dq pins if read operation | first stage of rop/blend and compare units
4 | | second stage of rop/blend and compare units
5 | | third stage of rop/blend and compare units
6 | compare result transferred from pass_out pin to pass_in pin | fourth stage of rop/blend units; write enable generated, if write operation.
7 | | write result to pixel buffer and dirty tags if allowed.
8 | hit pin changes, if the write operation is a stateful data write and it is successful. |
the pipeline flow of the rop/blend units and the dual compare units of the pixel alu and the pixel buffer is illustrated in figure 3.15. the dotted lines in the figure show the boundaries of the pipeline stages, and the numbers in the square boxes indicate the pipeline stages. a pipeline stage begins with the rising edge of mclk and ends just prior to the next rising edge. for example, the palu_be and palu_a pins are presented to the 3d-ram by the rendering controller in pipeline stage 1. on the rising edge of mclk, they are latched in and pipeline stage 2 starts. beginning in pipeline stage 3, data from the pixel buffer becomes available either as output to the palu_dq pins during an alu read operation or as input to the dual compare unit and the rop/blend units during an alu write operation. if the operation is a write, then palu_dq must be presented with data from the rendering controller in pipeline stage 2, to be latched in and used by the pixel alu together with the data from the pixel buffer.

figure 3.14 example of pixel port read/write operations to satisfy the pipeline flow
figure 3.15 pixel alu and pixel buffer block diagram with pipeline flow
the picking logic

the block diagram of the picking logic is shown again in figure 3.16. at the beginning, the picking logic should be enabled and the hit flag should be cleared. this is done either by asserting the reset pin low or by writing the data eh into byte 3 of the compare control register. note that writing 1 to bits 27 and 25 of the compare control register effectively generates one-shots to load the pick enable flag and the hit flag, respectively. the user does not need to perform a second register write operation to reset bits 27 and 25 to 0. the hit pin will be set high after seven cycles (corresponding to pipeline stage 8, as in table 3.12). in figure 3.16, this is indicated by the number 8 in the square box above the hit pin label.

a sequence of stateful data write operations may be issued immediately after the register write, and their effects will take place after the hit pin is set high by this initial register write. if any of the stateful data writes in the sequence causes the on-chip and off-chip comparison tests to pass (pass_in [1:0] and pass_out are both high at pipeline stage 6), the hit pin is set low until the hit flag is cleared by writing "10" into bits 25 and 24 of the compare control register. see also figure 8.6, "pick logic timing," for an illustration of the operations described in this section.

figure 3.16 block diagram of the picking logic
operations of the pixel alu

all operations that involve the pixel alu and the pixel buffer but not the dram array are collectively referred to as pixel alu operations. table 3.13 summarizes the pixel alu operations. there are two categories of pixel alu operations: register operations and pixel data operations.

register operations include reading the identification register and writing the control registers; in this case, the register is specified by the palu_a pins.

pixel data operations include reading data from the pixel buffer, writing data to the pixel buffer in four different modes, replacing dirty tag data, and changing dirty tag data with the or function; in this case, the block address is assigned by the palu_a [5:3] pins, and the word address is assigned by the palu_a [2:0] pins.

to support all the blending operations specified in the opengl specification version 1.1 (december 21, 1995), a new pixel alu command, called initiate two-cycle blending, is added to this implementation of 3d-ram.

note 1: one mclk cycle must be inserted between the write to the color depth select register and one of the following alu operations: read pixel buffer, stateful initial data write, stateful normal data write, and initiate two-cycle blending. see also the description of the color depth select register.
(new rev.1.03)

table 3.13 pixel alu operation encoding

palu_en | palu_we | palu_op | palu_a | operation
00 | | | | nop
10 | | | | nop
01 | | | | nop
11 | 0 | 000 | block:word | read pixel buffer (note 1)
11 | 0 | 001 | | reserved
11 | 0 | 010 | | reserved
11 | 0 | 011 | | reserved
11 | 0 | 100 | | reserved
11 | 0 | 101 | | reserved
11 | 0 | 110 | | reserved
11 | 0 | 111 | 000111 | read identification register
11 | 1 | 000 | block:word | stateless initial data write
11 | 1 | 001 | block:word | stateless normal data write
11 | 1 | 010 | block:word | stateful initial data write (note 1)
11 | 1 | 011 | block:word | stateful normal data write (note 1)
11 | 1 | 100 | block:xxx | replace dirty tag
11 | 1 | 101 | block:xxx | or dirty tag
11 | 1 | 110 | block:word | initiate two-cycle blending (new) (note 1)
11 | 1 | 111 | register | write control registers (note 1)
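the encoding in table 3.13 maps naturally onto a lookup. the python sketch below is only a model for a simulator or test bench: it decodes palu_en, palu_we, and palu_op and ignores palu_a, so it does not check that a read identification register access actually uses address 000111. the names are hypothetical.

```python
READ_OPS = {
    0b000: "read pixel buffer",
    0b111: "read identification register",
}

WRITE_OPS = {
    0b000: "stateless initial data write",
    0b001: "stateless normal data write",
    0b010: "stateful initial data write",
    0b011: "stateful normal data write",
    0b100: "replace dirty tag",
    0b101: "or dirty tag",
    0b110: "initiate two-cycle blending",
    0b111: "write control registers",
}


def decode(palu_en, palu_we, palu_op):
    """Return the operation name for a 2-bit enable, 1-bit we, 3-bit opcode."""
    if palu_en != 0b11:
        return "nop"              # any enable value other than 11 is a nop
    if palu_we:
        return WRITE_OPS[palu_op & 0b111]
    return READ_OPS.get(palu_op & 0b111, "reserved")
```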
register operations

there are fourteen registers in the 3d-ram. their encoding is shown in table 3.14. among these registers, the identification register is read-only, and all other registers are write-only. all registers are 32 bits wide except the constant source register, which is 36 bits wide. the write-only registers are loaded from the palu_dq [31:0] pins. in the case of the 36-bit constant source register, the palu_dx [3:0] pins specify the most significant four bits. operations launched in previous cycles are never affected by the current register load. operations launched in the following cycles are always affected by the register load.

the palu_be [3:0] pins apply to all register read and write operations. palu_be [0] enables writes to bits 7 through 0; palu_be [1] enables writes to bits 15 through 8; palu_be [2] enables writes to bits 23 through 16; palu_be [3] enables writes to bits 31 through 24.

finally, during the stateful data write modes, all bits in these registers are fully effective; during the stateless data write modes, all register bits are ignored, except that the color depth select register, the bits controlling the picking logic, and the stencil control register maintain some of their special functions. please refer to the description of these bits for more details.
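the byte-enable behaviour of register writes can be modelled in a few lines. the python sketch below is a software model only, with a hypothetical function name; it covers the 32-bit registers and leaves out the palu_dx extension bits of the constant source register, whose byte-enable handling is not described here.

```python
def register_write(current, data, palu_be):
    """Return the new 32-bit register value after a byte-enabled write.

    palu_be bit n enables bits 8n+7 through 8n of the register.
    """
    mask = 0
    for n in range(4):
        if (palu_be >> n) & 1:
            mask |= 0xFF << (8 * n)
    return (current & ~mask & 0xFFFFFFFF) | (data & mask)
```

for instance, writing with palu_be = 1000b changes only byte 3, which is how a controller could update the rop/blend unit 3 field of a shadow copy of the rop/blend control register without disturbing units 0 through 2.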
table 3.14 register address encoding

palu_a | register | mnemonic | type | reset value | stateless mode
000 000 | plane mask | pm | write only | ffff ffffh | not applicable
000 001 | constant source | csr | write only | 0 0000 0000h | not applicable
000 010 | match mask | matmask | write only | 0000 0000h | not applicable
000 011 | magnitude mask | magmask | write only | 0000 0000h | not applicable
000 100 | rop/blend control | rbc | write only | 0303 0303h | 0303 0303h
000 101 | compare control | ccr | write only | 0a00 0000h | 0000 0000h
000 110 | write address control | wac | write only | 0000 0000h | 0000 0000h
000 111 | identification* | id | read only | 0130 a039h | not applicable
001 000 | blend_2 control (new) | bld2 | write only | 0000 0000h | 0000 0000h
001 001 | preblend control (new) | pbc | write only | 0000 0000h | 0000 0000h
001 010 | stencil planes (new) | stp | write only | 00ff 0000h | 00ff 0000h
001 011 | stencil control (new) | stc | write only | 3330 0000h | 3330 0000h
001 110 | pass_ins select (new) | pins | write only | 0000 0100h | not applicable
001 111 | color depth select (new) | cds | write only | 0000 0000h | programmed value

*note: reset value for identification register is for version 0
identification register (id [31:0])

the read-only identification register contains the manufacturer identification code (id), part number code, and version code in the format shown in figure 3.17. the manufacturer id is 01ch for mitsubishi electronics. the part number is read as 130ah for M5M410092B. bit 0 is always 1, so for version 0 this identification register should be read as 0130 a039h.

plane mask register (pm [31:0])

this register affects both the stateful data writes of the pixel alu operations and the masked write block (mwb) of the dram operations. the effect is simultaneous on both types of operations. therefore, the user must exercise caution to ensure that the desired plane masking is achieved when such concurrency between the pixel alu and the dram array is exploited.

for the stateful data writes, each bit of the plane mask register is a per-bit write enable for the 32-bit data entering the pixel buffer. for the mwb operation, each bit of the plane mask register serves as a per-bit write enable for the 32-bit word 0 entering the sense amplifiers of a dram bank, and the same write masking mechanism is applied to the upper seven words of the specified pixel buffer block. figure 3.2 illustrates this masking relationship between the write data and the bits of the plane mask register. the value 1 means write enable; the value 0 means write disable. this register resets to ffff ffffh.

constant source register (csr [35:0])

this register is used to store 36-bit data that is loaded from the palu_dq and palu_dx pins. (the data extension pins palu_dx [3:0] are loaded into the most significant four bits of the constant source register.) the bits of this register are commonly referred to as kx [3:0] for the most significant four bits and k [31:0] for the low-order 32 bits.
The four rop/blend units and the dual compare units can individually select this register to provide data. This register resets to 0 0000 0000h.

Match mask register (mtm [31:0])

This register determines which data bits participate in the match test. Setting a bit of this register to "1" causes the corresponding data bit to be compared by the match comparison unit. Setting a bit of this register to "0" causes the corresponding data bit to be ignored in the match test. This register resets to 0000 0000h.

Magnitude mask register (mgm [31:0])

This register determines which data bits participate in the magnitude test. Setting a bit of this register to "1" causes the corresponding data bit to be compared by the magnitude comparison unit. Setting a bit of this register to "0" causes the corresponding data bit to be ignored in the magnitude test. This register resets to 0000 0000h.

Figure 3.17 Identification register data format (fields, from the most significant bit: version, part number, manufacturer id, and a constant "1" in bit 0)
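As a quick check of the identification register values quoted above, the following Python sketch splits a 32-bit id word into its fields. The field boundaries used here (version in bits 31:28, part number in bits 27:12, manufacturer id in bits 11:1) are an assumption inferred from the stated values 01ch, 130ah, and 0130 a039h, since the bit numbering of figure 3.17 did not survive extraction. The function name `decode_id` is hypothetical.

```python
def decode_id(word: int) -> dict:
    """Split a 32-bit identification register value into its fields.

    Field boundaries are assumed (inferred from the documented values),
    not taken from figure 3.17 itself.
    """
    return {
        "version": (word >> 28) & 0xF,            # assumed bits 31:28
        "part_number": (word >> 12) & 0xFFFF,     # assumed bits 27:12
        "manufacturer_id": (word >> 1) & 0x7FF,   # assumed bits 11:1
        "bit0": word & 0x1,                       # documented as always 1
    }


fields = decode_id(0x0130A039)
print({name: hex(value) for name, value in fields.items()})
```

With these assumed boundaries, the version 0 value 0130 a039h decodes to part number 130ah and manufacturer id 01ch, which agrees with the text.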
Figure 3.18 Rop/blend control register data format (one 8-bit field per rop/blend unit; within each field, from the most significant bits: multp1 select, rop/adder source select, rop/blend select, raster op. select)

Rop/blend control register (rbc [31:0])

This register controls the operations of the four rop/blend units. Each rop/blend unit is independently controlled by an 8-bit field of this 32-bit register. Bits 7 through 0 control unit 0, and the same layout is repeated three more times for units 1, 2, and 3: bits 15 through 8 for unit 1, bits 23 through 16 for unit 2, and bits 31 through 24 for unit 3.

This register resets to 0303 0303h. This value passes data unchanged from the palu_dq pins through all four rop/blend units. During a stateless data write access, the rop/blend units behave as if this register were set to 0303 0303h, regardless of its actual value.

The data format of the rbc register is illustrated in figure 3.18 and explained below.

Bits 8n+7 through 8n+6 select a source for multp1 (table 3.15).

Bit 8n+5 selects a source for rop unit n and for the adder in blend unit n. If this bit is "0", the data from palu_dq [8n+7:8n] is selected; if this bit is "1", the constant source register bits k [8n+7:8n] are selected.

Bit 8n+4 configures rop/blend unit n. For bit 28, the bit value "0" sets rop/blend unit 3 in rop and stencil mode and forces the output of the alpha-saturate block to be always old [31:24], regardless of the programmed values in bld2 [29:28] and pbc [29:28]; the bit value "1" sets rop/blend unit 3 in blend mode and enables the alpha-saturate logic. For bits 4, 12, and 20, the bit value "0" sets rop/blend units 0, 1, and 2 in rop mode, respectively; the bit value "1" sets rop/blend units 0, 1, and 2 in blend mode, respectively. Note that when in blend mode, blend units 0, 1, and 2 calculate with the correct alpha-saturate value only when bit 28 of this register is also set to "1". Note also that the bit field st.enable does not specifically set the operation mode of alu unit 3; st.enable enables bit planes 31 through 24 to be recognized as stencil bits when alu unit 3 is in rop and stencil mode, and is ignored when alu unit 3 is in blend mode. (new rev.1.03)

Bits 8n+3 through 8n+0 select one of the sixteen possible raster operations for unit n. In table 3.16, "new" represents the data from the palu_dq [31:0] pins or from the constant source register bits k [31:0] (as selected by bit 8n+5), "old" represents the 32-bit data from the pixel buffer, and "~" means logical inversion. All of these operations are bit-wise logical operations.

Table 3.15 Multp1 source encoding for rop/blend unit n

rbc [8n+7:8n+6]   Fraction source for rop/blend unit n
00                100h (1.00)
01                {kx [n], k [8n+7:8n]}
10                {palu_dx [n], palu_dq [8n+7:8n]}
11                {palu_dx [3], palu_dq [31:24]}

Compare control register (ccr [31:0])

This register controls the picking logic and the dual compare unit, and thereby indirectly influences the status of the pass_out pin. Only 12 bits of this register are currently defined. The other 20 bits are reserved.

Bits 31 through 28, 23 through 18, 15 through 11, and 7 through 3 are reserved. They are written as "0" for future compatibility.

Bits 27 through 24 control the picking logic. Bits 25 and 24 clear or set the hit flag. Bits 27 and 26 enable or disable the picking logic. The encodings are given in table 3.17 and table 3.18. Note that after bits 27 and 25 are loaded with "1", these bits are automatically reset to "0" in the next mclk cycle, which restores the state machine to the "no change" state and saves the follow-up register writes. In this sense, writing "1" into bits 27 and 25 generates one-shots at the outputs of ccr [27] and ccr [25].
Figure 3.19 Compare control register data format (fields: picking logic, dual compare source, decal stencil, match test, magnitude test; all other bits reserved)

Table 3.16 Raster operation encoding

rbc [8n+3:8n+0]   Raster operation
0000              all bits zero
0001              new and old
0010              new and ~old
0011              new
0100              ~new and old
0101              old
0110              new xor old
0111              new or old
1000              ~new and ~old
1001              ~new xor old
1010              ~old
1011              new or ~old
1100              ~new
1101              ~new or old
1110              ~new or ~old
1111              all bits one

Table 3.17 Pick enable encoding

ccr [27:26]   Function
0x            no change to pick enable flag
10            disable picking logic
11            enable picking logic

Table 3.18 Pick hit encoding

ccr [25:24]   Function
0x            no change to hit flag
10            clear hit flag
11            set hit flag
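The raster operation encoding of table 3.16 can be modeled in a few lines of Python. This is only an illustrative sketch of rop mode: it ignores the blend path, the multp1 select, and the source select bit (the caller is assumed to pass whichever "new" data bit 8n+5 would have selected). The names `RASTER_OPS`, `rop_byte`, and `rop_word` are hypothetical.

```python
# One entry per row of table 3.16; n is "new", o is "old" (8-bit values).
RASTER_OPS = {
    0b0000: lambda n, o: 0x00,
    0b0001: lambda n, o: n & o,
    0b0010: lambda n, o: n & ~o,
    0b0011: lambda n, o: n,
    0b0100: lambda n, o: ~n & o,
    0b0101: lambda n, o: o,
    0b0110: lambda n, o: n ^ o,
    0b0111: lambda n, o: n | o,
    0b1000: lambda n, o: ~n & ~o,
    0b1001: lambda n, o: ~n ^ o,
    0b1010: lambda n, o: ~o,
    0b1011: lambda n, o: n | ~o,
    0b1100: lambda n, o: ~n,
    0b1101: lambda n, o: ~n | o,
    0b1110: lambda n, o: ~n | ~o,
    0b1111: lambda n, o: 0xFF,
}


def rop_byte(code: int, new: int, old: int) -> int:
    """Apply one raster operation to a single 8-bit lane."""
    return RASTER_OPS[code & 0xF](new & 0xFF, old & 0xFF) & 0xFF


def rop_word(rbc: int, new: int, old: int) -> int:
    """Apply the raster op in rbc[8n+3:8n] to byte n of a 32-bit word."""
    result = 0
    for n in range(4):
        code = (rbc >> (8 * n)) & 0xF
        result |= rop_byte(code, new >> (8 * n), old >> (8 * n)) << (8 * n)
    return result


# The reset value 0303 0303h selects "new" in every unit.
print(hex(rop_word(0x03030303, 0x12345678, 0xFFFFFFFF)))
```

The reset value 0303 0303h selects code 0011 ("new") in every unit, which matches the statement that it passes data through unchanged, while 0505 0505h selects code 0101 ("old").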
Bits 17 through 16 select the source for the dual compare unit (table 3.19). Bit 16 directly controls the match compare source, while the result of bit 17 xor bit 16 controls the magnitude compare source. In this way, the first two codes are compatible with the previous generations, m5m410092 and m5m410092a, which had bit 17 reserved and always set to "0". (new)

Bit 11 previously enabled the invert stencil mode in m5m410092a but is now reserved and should always be written as "0".

Warning! The invert stencil mode is removed from this device, M5M410092B, and an incompatibility exists between this device and its previous generation, m5m410092a, with respect to this function.

Bit 10 enables the decal stencil mode (see the section on "stencil modes" on page 40). "0" selects the normal rendering operation, where the stateful write is enabled when pass_in [1:0] and pass_out are all high. "1" selects the decal stencil mode, where the stateful write is enabled in one of two conditions: (1) pass_in [1:0] and pass_out are all high; or (2) pass_in [1:0] are high and the match compare output is low (failing the match test).

Bits 9 through 8 select one of four tests for the match compare unit (table 3.20).

Bits 2 through 0 select one of eight tests for the magnitude compare unit (table 3.21).

During a stateful data write to the pixel buffer, the pixel data is actually written only when the magnitude test, the match test, and the external pass_in pin all pass. The pass_out pin is set to pass only when the magnitude test and the match test both pass.

While the picking logic is enabled, all stateful data writes that pass both compare tests (pass_out high) while pass_in is high will set (i.e., or a "1" into) the hit flag. The hit flag then remains set until it is cleared by writing "10" into bits 25 and 24. The hit flag is active high, and when it is "1", it drives the open-drain hit pin low.
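The write-enable rule above, together with the match and magnitude test encodings of tables 3.20 and 3.21, can be sketched as follows. This is a simplified model under stated assumptions: a single "new" value feeds both compare units (the source selection in ccr [17:16] is not modeled), a mask bit of 1 means the bit participates in the test, and both pass_in pins are collapsed into one boolean. The function name `stateful_write_enable` is hypothetical.

```python
import operator

# Table 3.20: ccr[9:8] match test encoding.
MATCH_TESTS = {
    0b00: lambda new, old: True,
    0b01: lambda new, old: False,
    0b10: operator.eq,
    0b11: operator.ne,
}

# Table 3.21: ccr[2:0] magnitude test encoding.
MAGNITUDE_TESTS = {
    0b000: lambda new, old: True,
    0b001: operator.gt,
    0b010: operator.eq,
    0b011: operator.ge,
    0b100: lambda new, old: False,
    0b101: operator.le,
    0b110: operator.ne,
    0b111: operator.lt,
}


def stateful_write_enable(ccr, new, old, match_mask, magnitude_mask, pass_in=True):
    """Return (pass_out, write_enable) for one stateful data write."""
    match_ok = MATCH_TESTS[(ccr >> 8) & 0b11](new & match_mask, old & match_mask)
    magnitude_ok = MAGNITUDE_TESTS[ccr & 0b111](new & magnitude_mask, old & magnitude_mask)
    pass_out = match_ok and magnitude_ok
    write = pass_in and pass_out
    if (ccr >> 10) & 1:
        # Decal stencil mode: a failed match test also enables the write.
        write = write or (pass_in and not match_ok)
    return pass_out, write


print(stateful_write_enable(0x0000_0201, 5, 3, 0xFFFFFFFF, 0xFFFFFFFF))
```

Setting bit 10 in the same example (ccr = 0000 0601h) turns the failed match test into an enabled write, which is the decal stencil behavior described for condition (2).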
This register resets to 0a00 0000h, which means that the hit flag is cleared and the picking logic is disabled. During a stateless data write access, the dual compare unit behaves as if this register were set to 0000 0000h, regardless of its actual value.

Table 3.19 Dual compare source selection encoding

ccr [17:16]   Magnitude compare source    Match compare source        Comments
00            palu_dq pins                palu_dq pins                backward compatible
01            constant source register    constant source register    backward compatible
10            constant source register    palu_dq pins                new feature
11            palu_dq pins                constant source register    new feature

Table 3.20 Match test encoding

ccr [9:8]   Test condition
00          always pass
01          never pass
10          pass if new == old
11          pass if new != old

Table 3.21 Magnitude test encoding

ccr [2:0]   Test condition
000         always pass
001         pass if new > old
010         pass if new == old
011         pass if new >= old
100         never pass
101         pass if new <= old
110         pass if new != old
111         pass if new < old

Write address control register (wac [31:0])

Only 1 bit of this register is currently used for the pixel alu function. The other 31 bits are reserved.

Bit 0 selects the source for the pixel buffer write address. "0" selects the pixel buffer write address from the palu_a [5:0] pins. "1" selects the pixel buffer write address from the palu_dq [29:24] pins.

Bits 31 through 1 are reserved. They are written as "0" for future compatibility.

This register resets to 0000 0000h. During a stateless data write, the write address control register behaves as if it were set to 0000 0000h, regardless of its actual value.

An application of the write address control register

The write address control register is used to speed up vertical scroll in screen display. Taking advantage of the pipeline structure, reading data from one pixel buffer location and writing it into another location can be achieved in one stateful data write. The 1-bit write address control register selects the pixel buffer write address between the palu_a [5:0] pins (the normal path) and the palu_dq [29:24] pins (the vertical scroll acceleration path).

For the vertical scroll in a screen as illustrated in figure 3.20, the data in pixel a is to be moved to pixel b. Assume that pixel data a is stored in pixel buffer [block 3:word 0] and that pixel b is in [block 0:word 5]. Figure 3.21 shows the pipeline flow for the write address selection, and figure 3.12 shows the data stream for the example in figure 3.20.

Before the stateful data write that moves the pixel a data to the pixel b location is started, four registers should be set as follows:

- Write into the write address control register with "1".
- Write into the rop/blend control register with 0505 0505h to select old data.
- Write into the compare control register with 0000 0000h to pass data into the pixel buffer.
- Write into the plane mask register with ffff ffffh to pass every bit into the pixel buffer for the stateful data write.

Figure 3.20 Pixel movement in vertical scroll (screen display; pixel a is at [block 3 : word 0] in the pixel buffer and pixel b is at [block 0 : word 5] in the pixel buffer)

The stateful data write issued later should have the read address asserted on the palu_a [5:0] pins and the write address on the palu_dq [29:24] pins at the next cycle. Seven cycles later in the pipeline path, the pixel a data will be written into the pixel b location.

Figure 3.21 Pipeline flow of the write address control (the pixel buffer read address comes from palu_a [5:0]; the write address comes from either palu_a [5:0] or palu_dq [29:24])

Figure 3.22 Pipeline for performing a vertical scroll (timing diagram of mclk, palu_op, palu_a, palu_we, and palu_dq: four control register writes with palu_a = 000110, 000100, 000101, 000000 and palu_dq = 0000 0001h, 0505 0505h, 0000 0000h, ffff ffffh, followed by a stateful write with read address 011000 and write address 000101)
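The register setup for the vertical scroll example can be written down as a short sequence. The sketch below is illustrative only: it produces a list of abstract steps, not a bus-accurate waveform, and it does not model the one-cycle offset between the read and write addresses. The register addresses come from table 3.14. The 6-bit pixel buffer address is assumed to be the block number in the upper three bits and the word number in the lower three bits, which agrees with the values 011000 and 000101 in figure 3.22. The names `pixel_buffer_address` and `vertical_scroll_steps` are hypothetical.

```python
# palu_a register addresses from table 3.14.
REGISTER_ADDRESS = {
    "wac": 0b000110,  # write address control
    "rbc": 0b000100,  # rop/blend control
    "ccr": 0b000101,  # compare control
    "pm": 0b000000,   # plane mask
}


def pixel_buffer_address(block: int, word: int) -> int:
    """Pack block and word into a 6-bit address (assumed layout: block:word)."""
    return ((block & 0b111) << 3) | (word & 0b111)


def vertical_scroll_steps(src, dst):
    """List the control register writes and the stateful write for one move."""
    setup = [
        ("write_control_register", REGISTER_ADDRESS["wac"], 0x00000001),
        ("write_control_register", REGISTER_ADDRESS["rbc"], 0x05050505),
        ("write_control_register", REGISTER_ADDRESS["ccr"], 0x00000000),
        ("write_control_register", REGISTER_ADDRESS["pm"], 0xFFFFFFFF),
    ]
    read_address = pixel_buffer_address(*src)            # driven on palu_a[5:0]
    write_address = pixel_buffer_address(*dst)           # driven on palu_dq[29:24]
    return setup + [("stateful_write", read_address, write_address << 24)]


for step in vertical_scroll_steps(src=(3, 0), dst=(0, 5)):
    print(step[0], format(step[1], "06b"), format(step[2], "08x"))
```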
Figure 3.23 Blend_2 control register data format

Blend_2 control register (bld2 [31:0]) (new)

This register provides additional control of the multiplicands and addends for the four blend units. Each blend unit is independently controlled by a 4-bit field of this 32-bit register. Bits 3 through 0 control unit 0, and the same layout is repeated three more times on byte boundaries for units 1, 2, and 3: bits 11 through 8 for unit 1, bits 19 through 16 for unit 2, and bits 27 through 24 for unit 3. In addition, bits 29 and 28 are used to select the output of the alpha-saturate block. This output can then be selected by each of the rop/blend units as the source for the second multiplicand, multp2.

The data format of this register is illustrated in figure 3.23.

Bits 31 through 30, 23 through 20, 15 through 12, and 7 through 4 are reserved. They shall be written as "0" for future compatibility.

Bits 29 through 28 determine the output of the alpha-saturate block (table 3.22) when rbc [28] = 1 (see also the description of the rop/blend control register). This output can then be selected by any of the four units as the source of the second multiplicand, multp2, using bld2 [8n+3:8n+2]. (new rev.1.03)

Bits 8n+3 through 8n+2 select the source for the second multiplicand, multp2 (table 3.23).

Bit 8n+1 selects the source for the first multiplicand, multp1. If this bit is "0", the data selected by bits [8n+7:8n+6] of the rop/blend control register is used; if this bit is "1", the old [8n+7:8n] data is used.
Fields shown in figure 3.23: alpha-saturate output select (bits 29 through 28), and for each unit multp2 select, multp1 select, and addend select; r = reserved.

Table 3.22 Encoding for output of alpha-saturate block

bld2 [29:28]   Output of alpha-saturate block
00             min{new [31:24], ~old [31:24]} (α-sat)
01             new [31:24] (As)
10             old [31:24] (Ad)
11             ~old [31:24] (~Ad)

Table 3.23 Encoding for multp2 source selection

bld2 [8n+3:8n+2]   Multiplicand 2 (multp2)
00                 old [8n+7:8n]
01                 ~old [8n+7:8n]
1x                 data selected by bits [29:28] of this register
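Tables 3.22 and 3.23 together define how multp2 is chosen for each blend unit. The following sketch models only that selection, not the multiply or add stages. The `rbc28` parameter stands for bit 28 of the rop/blend control register, which according to the rbc description forces the alpha-saturate output to old [31:24] when it is 0. The function names are hypothetical.

```python
def alpha_saturate_output(select: int, new_alpha: int, old_alpha: int) -> int:
    """Table 3.22: output of the alpha-saturate block for bld2[29:28]."""
    inverted_old = ~old_alpha & 0xFF
    outputs = (
        min(new_alpha, inverted_old),  # 00: alpha-saturate
        new_alpha,                     # 01: new[31:24]
        old_alpha,                     # 10: old[31:24]
        inverted_old,                  # 11: ~old[31:24]
    )
    return outputs[select & 0b11]


def multp2_source(bld2: int, unit: int, new: int, old: int, rbc28: bool = True) -> int:
    """Table 3.23: second multiplicand for blend unit `unit` (0..3)."""
    select = (bld2 >> (8 * unit + 2)) & 0b11
    old_byte = (old >> (8 * unit)) & 0xFF
    if select == 0b00:
        return old_byte
    if select == 0b01:
        return ~old_byte & 0xFF
    # 1x: use the alpha-saturate block output.
    old_alpha = (old >> 24) & 0xFF
    if not rbc28:
        return old_alpha  # forced to old[31:24] when rbc[28] = 0
    return alpha_saturate_output((bld2 >> 28) & 0b11, (new >> 24) & 0xFF, old_alpha)


print(hex(alpha_saturate_output(0b00, 0xC0, 0x80)))
```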
Bit 8n selects the source for the addend. If this bit is "0", the data selected by bit [8n+5] of the rop/blend control register is used; if this bit is "1", the old [8n+7:8n] data is used.

This register resets to 0000 0000h. This value causes the rop/blend units to operate based on the settings in the rop/blend control register, which assures compatibility with previous generations of the 3d-ram. During a stateless data write access, the rop/blend units behave as if this register were set to 0000 0000h, regardless of its actual value.

Figure 3.24 Preblend control register data format

Preblend control register (pbc [31:0]) (new) (new rev.1.03)

This register controls the first cycle, known as the preblend cycle, of the two-cycle loop back operation for the four rop/blend units by selecting multp2 for the preblend cycle and the addend for the second, or normal, cycle (see the section on "pixel alu blend factor selections" on page 26). During the preblend cycle, multp1 is fixed to palu_dq [8n+7:8n] and the addend is fixed to {palu_dx [n], palu_dq [8n+7:8n]}.

Each rop/blend unit is independently controlled by a 4-bit field of this 32-bit register. Bits 3 through 0 control unit 0, and the same layout is repeated three more times on byte boundaries for units 1, 2, and 3: bits 11 through 8 for unit 1, bits 19 through 16 for unit 2, and bits 27 through 24 for unit 3. Since there is only one alpha-saturate block (located in rop/blend unit 3), the selection made by bits 29 and 28 applies to all four blend units.

The data format of this register is illustrated in figure 3.24.

Bits 31, 30, 25, 23 through 20, 17, 15 through 12, 9, 7 through 4, and 1 are reserved. They shall be written as "0" for future compatibility.

Bits 29 through 28 determine the output of the alpha-saturate block (table 3.24) when rbc [28] = 1 (see also the description of the rop/blend control register).
This output can then be selected by any of the four units as the source of the second multiplicand, multp2, using pbc [8n+3:8n+2]. (new rev.1.03)

Fields shown in figure 3.24: alpha-saturate output select (bits 29 through 28), and for each unit multp2 select and addend select; r = reserved.

Table 3.24 Encoding for output of alpha-saturate block

pbc [29:28]   Output of alpha-saturate block
00            min{new [31:24], ~old [31:24]} (α-sat)
01            new [31:24] (As)
10            old [31:24] (Ad)
11            ~old [31:24] (~Ad)
Bits 8n+3 through 8n+2 select the source for the second multiplicand, multp2, for the preblend cycle (table 3.25).

Bit 8n selects the source for the addend for the second, or normal, cycle. If this bit is "0", the multiplier output from the preblend cycle is "looped back" to become the addend for the normal cycle; if this bit is "1", the addend from the preblend cycle is "looped back" to become the addend for the normal cycle.

This register resets to 0000 0000h. This value causes the rop/blend units to operate based on the settings in the rop/blend control register, which assures compatibility with previous generations of the 3d-ram. During a stateless data write access, the rop/blend units behave as if this register were set to 0000 0000h, regardless of its actual value.

Figure 3.25 Stencil planes register data format

Stencil planes register (stpl [31:0]) (new)

This register defines the bits allocated for stencil planes in OpenGL stencil mode and the value masking of these stencil planes when the selected stencil comparison function is performed against the stencil reference value.

The on-chip stencil hardware acceleration features are implemented in rop/blend unit 3. Therefore, only bits 31 through 24 of the 32-bit alu unit are available for use as stencil planes. Any number of bits from 0 through 8 may be allocated as stencil planes. In order for the GL_INCR and GL_DECR functions to work properly, it is necessary to keep the stencil planes in one contiguous group.

Bits 31 through 24 = st.enable [7:0] provide a bitwise selection of which bits are allocated as stencil planes in OpenGL mode. Bits that are enabled are subject to the controls of the stencil control register. Bits that are disabled are available for use by rop unit 3 or the dual compare unit. For each bit, a "0" disables that bit for OpenGL stencil mode. A "1" enables that bit for OpenGL stencil mode.
Again, for the increment and decrement stencil functions to work properly, it is necessary that the enabled stencil planes be in one contiguous group. For example, if st.enable [7:0] is set to the value 0011 1100, then bits 29 through 26 of the alu unit are used for stencil planes and bits 31, 30, 25, and 24 may be used for rop or compare functions. Note that the blending function in rop/blend unit 3 cannot be used if any bits in this unit are used for OpenGL stencil.

Bits 23 through 16 = stf.mask [7:0] provide a bitwise stencil value mask for stencil comparison functions. This mask maps directly to bits 31 through 24 of the 32-bit bus. That is, bit 23 of this register masks or unmasks bit 31 of the alu bus, bit 22 masks or unmasks bit 30 of the alu bus, and so on. For each bit, a "0" causes the corresponding bit to be ignored in stencil comparison functions. A "1" causes the corresponding bit to be used in the stencil comparison functions. For convenience and clarity, the mnemonic used here corresponds to the OpenGL terminology. Specifically, "stf" refers to the OpenGL command "glStencilFunc", with "mask" referring to the parameter mask of this command. Thus, this bit field also corresponds to the symbolic parameter "GL_STENCIL_VALUE_MASK".

Bits 15 through 0 are reserved. They shall be written as "0" for future compatibility.

This register resets to 00ff 0000h, which means that all stencil planes are disabled. During a stateless data write access, the pixel alu behaves as if this register were set to 00ff 0000h, regardless of its actual value.

Fields shown in figure 3.25, from the most significant bit: st.enable [7:0], stf.mask [7:0], reserved.

Table 3.25 Encoding for multp2 source selection for preblend cycle

pbc [8n+3:8n+2]   Multp2 source for preblend cycle
00                old [8n+7:8n]
01                ~old [8n+7:8n]
1x                data selected by bits [29:28] of this register

Figure 3.26 Stencil control register data format

Stencil control register (stc [31:0]) (new)

This register defines the stencil comparison function and the stencil operations that will be performed for the OpenGL stencil mode. Only 12 bits of this register are currently used for stencil hardware acceleration, and the other 20 bits are reserved. For convenience and clarity, the mnemonics used here correspond to the OpenGL terminology. Specifically, "stop" refers to the OpenGL command "glStencilOp", with "zpass", "zfail", and "fail" referring to the parameters of this command; "stf" refers to "glStencilFunc", with "func" referring to the parameter func of this command. Also, it is important to note that the bits programmed in this register are effective only when both pass_in pins are "1" and the match compare passes, since otherwise no data will be written into the pixel buffer and the pass_out pin will be "0".

Bits 31, 27, 23, 19, and 15 through 0 are reserved. They shall be written as "0" for future compatibility.

Bits 30 through 28 = stop.fail [2:0] define the stencil operation to be executed in the case of GL_STENCIL_FAIL.
That is, these bits determine which one of the stencil operations listed in table 3.26 will be performed when the stencil compare function fails. Disabled bits keep their old data. In the GL_INCR operation, the maximum value is defined as 2^n - 1, where n is the number of bits enabled by st.enable [7:0].

Fields shown in figure 3.26: stop.fail [2:0], stop.zfail [2:0], stop.zpass [2:0], stf.ref_select, stf.func [2:0]; r = reserved.
Bits 26 through 24 = stop.zfail [2:0] define the stencil operation to be executed in the case of GL_STENCIL_PASS_DEPTH_FAIL. That is, these bits determine which one of the stencil operations listed in table 3.27 will be performed when the stencil compare function passes but the magnitude compare function fails. Disabled bits keep their old data. In the GL_INCR operation, the maximum value is defined as 2^n - 1, where n is the number of bits enabled by st.enable [7:0].

Bits 22 through 20 = stop.zpass [2:0] define the stencil operation to be executed in the case of GL_STENCIL_PASS_DEPTH_PASS. That is, these bits determine which one of the stencil operations listed in table 3.28 will be performed when the magnitude compare and stencil compare functions both pass. Disabled bits use the rop unit results. In the GL_INCR operation, the maximum value is defined as 2^n - 1, where n is the number of bits enabled by st.enable [7:0].

Table 3.26 Stencil operation for stop.fail

stop.fail [2:0]   Stencil operation   Definition
000               GL_ZERO             enabled bits cleared to zero
001               GL_KEEP             enabled bits remain old data
010               GL_INVERT           enabled bits are inverted old
011               GL_REPLACE          enabled bits are replaced by stf.ref
100, 110          GL_INCR             enabled bits are incremented by 1 (clamped to max. value)
101, 111          GL_DECR             enabled bits are decremented by 1 (clamped to zero value)

Table 3.27 Stencil operation for stop.zfail

stop.zfail [2:0]  Stencil operation   Definition
000               GL_ZERO             enabled bits cleared to zero
001               GL_KEEP             enabled bits remain old data
010               GL_INVERT           enabled bits are inverted old
011               GL_REPLACE          enabled bits are replaced by stf.ref
100, 110          GL_INCR             enabled bits are incremented by 1 (clamped to max. value)
101, 111          GL_DECR             enabled bits are decremented by 1 (clamped to zero value)
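The stencil operations in tables 3.26 and 3.27 act only on the bit planes enabled by st.enable [7:0], with GL_INCR clamped to 2^n - 1. The sketch below models that on a single byte (alu bits 31:24). It assumes that the enabled planes form one contiguous group, as the text requires, and that the stf.ref bits are aligned to the same bit positions as the enabled planes; the second point is an assumption, not something the text states. Disabled bits keep their old data, which matches the fail and zfail cases. The function name `stencil_op` is hypothetical.

```python
def stencil_op(op: int, old: int, ref: int, enable: int) -> int:
    """Apply a 3-bit stencil operation code to the enabled planes of one byte."""
    enable &= 0xFF
    if enable == 0:
        return old & 0xFF                        # no stencil planes allocated
    shift = (enable & -enable).bit_length() - 1  # position of lowest enabled bit
    planes = bin(enable).count("1")              # n = number of enabled bits
    max_value = (1 << planes) - 1                # 2^n - 1
    value = (old & enable) >> shift

    if op == 0b000:                              # GL_ZERO
        value = 0
    elif op == 0b001:                            # GL_KEEP
        pass
    elif op == 0b010:                            # GL_INVERT
        value = ~value & max_value
    elif op == 0b011:                            # GL_REPLACE (ref assumed aligned)
        value = (ref & enable) >> shift
    elif op in (0b100, 0b110):                   # GL_INCR, clamped to max
        value = min(value + 1, max_value)
    else:                                        # 101, 111: GL_DECR, clamped to 0
        value = max(value - 1, 0)

    # Disabled bits keep their old data.
    return (old & ~enable & 0xFF) | (value << shift)


# st.enable = 0011 1100: stencil planes occupy bits 29..26 of the alu word.
print(format(stencil_op(0b100, 0b01001110, 0, 0b00111100), "08b"))
```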
Bit 19 = stf.ref_select defines the source of the stf.ref stencil data. Setting this bit to 0 selects the stf.ref from the pins palu_dq [31:24]; setting this bit to 1 selects bits 31 through 24 of the constant source register as the stf.ref.

Bits 18 through 16 = stf.func [2:0] define the stencil comparison function. The stf.ref stencil data (from the source selected by bit 19) is compared with the old stencil data based on the settings of these register bits. The mnemonic "stf" refers to the OpenGL command "glStencilFunc", with "ref" referring to the parameter ref of this command. Table 3.29 defines which test will be used for the stencil comparison.

This register resets to 3330 0000h, which means that the stencil test always passes. During a stateless data write access, the pixel alu behaves as if this register were set to 3330 0000h, regardless of its actual value.

Table 3.28 Stencil operation for stop.zpass

stop.zpass [2:0]  Stencil operation   Definition
000               GL_ZERO             enabled bits cleared to zero
001               GL_KEEP             enabled bits remain old data
010               GL_INVERT           enabled bits are inverted old
011               GL_REPLACE          enabled bits are replaced by stf.ref
100, 110          GL_INCR             enabled bits are incremented by 1 (clamped to max. value)
101, 111          GL_DECR             enabled bits are decremented by 1 (clamped to zero value)

Table 3.29 Stencil comparison functions

stf.func [2:0]   Stencil test   Definition
000              GL_ALWAYS      always pass stencil test
001              GL_GREATER     pass stencil test if (stf.ref && stf.mask) > (old && stf.mask)
010              GL_EQUAL       pass stencil test if (stf.ref && stf.mask) == (old && stf.mask)
011              GL_GEQUAL      pass stencil test if (stf.ref && stf.mask) >= (old && stf.mask)
100              GL_NEVER       always fail stencil test
101              GL_LEQUAL      pass stencil test if (stf.ref && stf.mask) <= (old && stf.mask)
110              GL_NOTEQUAL    pass stencil test if (stf.ref && stf.mask) != (old && stf.mask)
111              GL_LESS        pass stencil test if (stf.ref && stf.mask) < (old && stf.mask)
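The stencil control register fields and the comparison functions of table 3.29 can be sketched as follows. The "&&" in table 3.29 is interpreted here as a bitwise AND with stf.mask, which is an assumption based on how OpenGL applies the stencil value mask. The function names `decode_stc` and `stencil_test` are hypothetical.

```python
import operator

# Table 3.29: stf.func[2:0] encoding.
STENCIL_TESTS = {
    0b000: lambda ref, old: True,    # GL_ALWAYS
    0b001: operator.gt,              # GL_GREATER
    0b010: operator.eq,              # GL_EQUAL
    0b011: operator.ge,              # GL_GEQUAL
    0b100: lambda ref, old: False,   # GL_NEVER
    0b101: operator.le,              # GL_LEQUAL
    0b110: operator.ne,              # GL_NOTEQUAL
    0b111: operator.lt,              # GL_LESS
}


def decode_stc(stc: int) -> dict:
    """Extract the documented fields of the stencil control register."""
    return {
        "stop_fail": (stc >> 28) & 0b111,
        "stop_zfail": (stc >> 24) & 0b111,
        "stop_zpass": (stc >> 20) & 0b111,
        "stf_ref_select": (stc >> 19) & 0b1,
        "stf_func": (stc >> 16) & 0b111,
    }


def stencil_test(func: int, ref: int, old: int, mask: int) -> bool:
    """Compare masked reference and old stencil values (assumed bitwise AND)."""
    return bool(STENCIL_TESTS[func & 0b111](ref & mask, old & mask))


print(decode_stc(0x33300000))
```

Decoding the reset value 3330 0000h gives stf.func = 000 (GL_ALWAYS), which agrees with the statement that the stencil test always passes after reset.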
Figure 3.27 Pass_in select register data format

Pass_ins select register (pins [31:0]) (new)

Only 2 bits of this register are currently used for the pixel alu function, and the other 30 bits are reserved.

Bit 0 enables the pass_in [1] pin to participate in the internal write enable logic of the pixel buffer. Setting this bit to "0" disables pass_in [1], which is then internally set to "1" and has no effect on the operation of the pixel alu. Setting this bit to "1" enables the pass_in [1] signal and passes it through to be anded with the pass_in [0] signal. See also figure 3.13 for an illustration.

Bit 8 enables the pass_in [0] pin to participate in the internal write enable logic of the pixel buffer. Setting this bit to "0" disables pass_in [0], which is then internally set to "1" and has no effect on the operation of the pixel alu. Setting this bit to "1" enables the pass_in [0] signal and passes it through to be anded with the pass_in [1] signal. See also figure 3.13 for an illustration.

Bits 31 through 9 and 7 through 1 are reserved. They shall be written as "0" for future compatibility.

This register resets to 0000 0100h. This value assures compatibility with previous generations of the 3d-ram.

Color depth select register (cds [31:0]) (new)

Only 1 bit of this register is currently used for the pixel alu function, and the other 31 bits are reserved.

Bit 0 selects the color depth for pixel alu operations. "0" selects the normal (8,8,8,8) 32-bit blending mode. With this setting, color data can also be stored in the (5,6,5,0) 16-bit mode. "1" selects the (4,4,4,4) 16-bit blending mode. For details on the 16-bit color modes, refer to the section titled "16-bit color modes" in this chapter.

Bits 31 through 1 are reserved. They shall be written as "0" for future compatibility.

This register resets to 0000 0000h. This value assures compatibility with previous generations of the 3d-ram.
Note: One mclk cycle must be inserted between (a) the write to the color depth select register and (b) any of the following alu operations: read pixel buffer, stateful initial data write, stateful normal data write, and initiate two-cycle blending. Valid alu operations that provide such a one-mclk-cycle insertion include alu nop, write control register, and a stateless operation. (new rev.1.03)

Note: During a stateless data write, the color depth select register still behaves based on its current state. In other words, if bit 0 of the register is set to "1", the stateless write will be done in the (4,4,4,4) mode of operation.

Fields shown in figure 3.27: pass_in [0] select (bit 8) and pass_in [1] select (bit 0); all other bits reserved.
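The gating performed by the pass_ins select register, described earlier in this section, amounts to replacing a disabled pin with a constant 1 before the two signals are anded. A minimal sketch, with the hypothetical function name `effective_pass_in`:

```python
def effective_pass_in(pins: int, pass_in0: bool, pass_in1: bool) -> bool:
    """Combine the two pass_in pins as gated by the pass_ins select register."""
    enable_pass_in1 = bool(pins & 0x001)         # bit 0 gates pass_in[1]
    enable_pass_in0 = bool((pins >> 8) & 0x1)    # bit 8 gates pass_in[0]
    gated0 = pass_in0 if enable_pass_in0 else True   # disabled pin reads as 1
    gated1 = pass_in1 if enable_pass_in1 else True
    return gated0 and gated1


# Reset value 0000 0100h: only pass_in[0] participates.
print(effective_pass_in(0x00000100, pass_in0=True, pass_in1=False))
```

With the reset value 0000 0100h only pass_in [0] participates, which is consistent with the statement that this value preserves compatibility with previous generations.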
Prohibited register access

Writing to the control register address with palu_a [5:0] = "011000" and palu_op [2:0] = "111" for three consecutive rising edges of mclk causes the device to enter a special test mode. This access is strictly prohibited in order to avoid unexpected device behavior in the system.

Figure 3.28 Prohibited register access (timing diagram of mclk, palu_op [2:0], palu_a [5:0], reset, and test_mode: three consecutive wcr cycles with palu_op = "111" and palu_a = "011000" while reset is "l" take the device into test mode)
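A controller test bench could guard against the prohibited access with a simple check over the command stream. The sketch below is a hypothetical lint, not part of the device: it assumes the stream is available as one (palu_op, palu_a) pair per rising edge of mclk, and the function name `enters_test_mode` is made up for illustration.

```python
PROHIBITED_OP = 0b111          # palu_op[2:0] = "111" (write control register)
PROHIBITED_ADDRESS = 0b011000  # palu_a[5:0] = "011000"


def enters_test_mode(cycles) -> bool:
    """Return True if the prohibited pattern appears on 3 consecutive edges.

    `cycles` is an iterable of (palu_op, palu_a) pairs, one per rising
    edge of mclk.
    """
    consecutive = 0
    for palu_op, palu_a in cycles:
        if palu_op == PROHIBITED_OP and palu_a == PROHIBITED_ADDRESS:
            consecutive += 1
            if consecutive >= 3:
                return True
        else:
            consecutive = 0
    return False


stream = [(0b111, 0b011000), (0b111, 0b011000), (0b000, 0b000000), (0b111, 0b011000)]
print(enters_test_mode(stream))
```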
Pixel data operations

There are six pixel data operations: stateless initial data write, stateless normal data write, stateful initial data write, stateful normal data write, replace dirty tag, and or dirty tag.

In a stateless data write, the states of the pixel alu units are entirely ignored and the write data is passed to the pixel buffer unaffected. In a stateful data write, the settings of the various registers in the pixel alu, the results of the compare tests, and the state of the pass_in pin all affect whether the bits of the pixel data will be written into the pixel buffer.

Initial and normal data writes refer to the manner in which the dirty tag is updated. In an initial data write, the bits of the dirty tag are selectively set and cleared. In a normal data write, the bits of the dirty tag associated with the addressed block and word are inclusive ored with the palu_be pins, and the other bits of the dirty tag are unchanged.

The following sections describe these operations in detail.

Stateless initial data write

The stateless initial data write operation writes 32-bit data to the addressed block and word in the pixel buffer. No register values affect this operation. The rop/blend units simply pass the write data through without affecting it. The dual compare unit is ignored and does not inhibit the writing of data to the pixel buffer. The pass_out pin is forced to "1" for this operation. The pass_in pin has no effect.

The four dirty tag bits corresponding to the addressed word are set to the respective palu_be [3:0] value of the 32-bit data. The other 28 dirty tag bits corresponding to the addressed block are cleared to "0".

Stateless normal data write

The stateless normal data write operation writes 32-bit data to the addressed block and word in the pixel buffer. No register values affect this operation.
The rop/blend units simply pass the write data through without affecting it. The dual compare unit is ignored and does not inhibit the writing of data to the pixel buffer. The pass_out pin is forced to "1" for this operation. The pass_in pin has no effect.

The four dirty tag bits corresponding to the addressed block and word are inclusive ored with the palu_be [3:0] pins. The other 28 dirty tag bits corresponding to the addressed block are unchanged.

Stateful initial data write

The stateful initial data write operation writes 32-bit data to the addressed block and word. The new data may be combined with the existing destination data. The conditional write enable applies. All register values can affect this operation.

The four dirty tag bits corresponding to the addressed block and word are set to the palu_be [3:0] value. The other 28 dirty tag bits corresponding to the addressed block are cleared to "0". Both the writing to the pixel buffer and the updating of the dirty tag can be inhibited by a compare test failure (which means that either pass_in or pass_out is low).

Stateful normal data write

The stateful normal data write operation writes 32-bit data to the addressed block and word. The new data may be combined with the existing destination data. The conditional write enable applies. All register values can affect this operation.
the four dirty tag bits corresponding to the addressed block and word are inclusive ored with the palu_be [3:0] value. the other 28 dirty tag bits corresponding to the addressed block are unchanged. both the writing to the pixel buffer and the updating of the dirty tag can be inhibited by a compare test failure (which means that either pass_in or pass_out is low).

replace dirty tag

the 32-bit data on the palu_dq [31:0] pins replaces the dirty tag of the addressed block. the bit mapping between the dirty tag and palu_dq pins is explained on pages 22 and 42. the palu_be [3:0] pins determine which bytes of the palu_dq [31:0] data get written into the dirty tag ram. the dirty tag data passes through the rop portion of the rop/blend units. all of the registers behave the same way they would during a stateless data write.

or dirty tag

the 32-bit data on the palu_dq [31:0] pins is inclusive ored with the dirty tag of the addressed block. the bit mapping between the dirty tag and palu_dq pins is explained on pages 22 and 42. the palu_be [3:0] pins determine which bytes of the palu_dq [31:0] data get written into the dirty tag ram. the dirty tag data passes through the rop portion of the rop/blend units. all of the registers behave the same way they would during a stateless data write.

initiate two-cycle blending

this operation initiates a preblend cycle, which is the first cycle of the two-cycle blend operation. the preblend cycle is similar to a stateful write, except that the preblend cycle does not actually write the data back to the pixel buffer or affect the dirty tags in any way. when this operation is issued, the rop/blend units begin blending the data based on the settings of three registers: rop/blend control, blend_2 control, and preblend control.
after the multiplier stage of the preblend cycle, the multiplier output and the addend are looped back as possible addends for the next cycle, which is called the normal cycle. the preblend cycle must always be followed by a stateful initial/normal write with the same pixel buffer address on the palu_a pins. this operation is only for blending. the rop/blend control register must be set to perform blending for all rop/blend units or this operation will not function correctly. see the paragraphs on "pixel alu blend modes" starting on page 26 for further details.
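as an illustration of the dirty tag rules for the four data write operations described earlier in this chapter, the following python sketch models how the 32-bit dirty tag of one block changes. the function name, its parameters, and the bit layout (tag bit = 4 x word + byte) are assumptions made for this example only; the actual bit mapping is the one given on pages 22 and 42.

```python
# Hypothetical model of the dirty tag update performed by a data write.
# Assumption (not from the datasheet): dirty tag bit index = 4 * word + byte.

def update_dirty_tag(tag, word, palu_be, initial, passed=True):
    """Return the new 32-bit dirty tag of the addressed block.

    tag      current 32-bit dirty tag of the block
    word     addressed word within the block, 0..7
    palu_be  4-bit value on the palu_be [3:0] pins
    initial  True for an initial data write, False for a normal data write
    passed   False models a stateful write inhibited by a compare test failure
             (stateless writes always behave as passed=True)
    """
    if not 0 <= word < 8:
        raise ValueError("word must be in 0..7")
    if not passed:
        return tag  # the dirty tag update is inhibited together with the write
    field = (palu_be & 0xF) << (4 * word)
    if initial:
        # addressed four bits take the palu_be value, the other 28 are cleared
        return field
    # normal write: addressed bits are ORed with palu_be, the others are unchanged
    return (tag | field) & 0xFFFFFFFF
```

for example, an initial write to word 2 with palu_be = 0101 turns any previous tag into 0x00000500, whereas a normal write only adds bits to the existing tag.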
dram operations 4

dram operations

this chapter discusses the 3d-ram operations involving the dram arrays. these include the data transfers between a dram bank and the pixel buffer and between a dram bank and a video buffer.

an overview of dram operations

depending on the dram_op code, the dram_a address pins may be interpreted in three different ways: (1) page access (for access page and duplicate page operations), (2) block access (for read block, unmasked write block, and masked write block operations), and (3) scan line access (for video transfer operation).

a page access selects one page out of 257 pages (256 normal pages plus one extra page). dram_a8 is used to select the extra page: "1" is for the extra page, "0" is for choosing one of the 256 normal pages from a given bank. when dram_a8 is equal to "1", the lower eight address pins dram_a [7:0] should still be driven to stable states although they are not decoded internally.

the position and orientation of all pages displayed on the screen are controlled by the user. however, the mapping of data within a given page to a pixel buffer block is fixed and is shown in figure 4.1. (more precisely, from the perspective of dram operations, we should speak of the mapping between the sense amplifiers of a selected dram bank and a pixel buffer block. however, since the page-wide sense amplifiers act as a direct-mapped write-through pixel cache for a dram bank, the mapping between the sense amplifiers of a dram bank and the pixel buffer is the same as the mapping between a dram page and the pixel buffer. for convenience, the latter reference is used liberally in this document.)

a dram page is always organized as 10 blocks wide and 4 blocks high; this is fixed. a block always contains eight 32-bit words, for a total of 256 bits. in the case of 8 bits per pixel, the eight words in a given block may be viewed as 8 pixels wide by 4 pixels high.
thus, a dram page would be mapped to the screen as 80 pixels wide by 16 pixels high with 8 bits per pixel. in the case of 32 bits per pixel, the eight words in a given block may be viewed as 2 pixels wide by 4 pixels high. thus, a dram page would be mapped to the screen as 20 pixels wide by 16 pixels high with 32 bits per pixel. for simplicity, we represent these frame buffer organizations with the shorthand notations 80w x 16h x 8 and 20w x 16h x 32, respectively. several frame buffer organization examples are shown in chapter 6.

during a data transfer between a dram page and the pixel buffer, both the block location in the pixel buffer and the block location in the dram page must be specified. in the pixel buffer, one of the eight blocks is selected through the dram_a [8:6] pins. in the height direction of a dram page, the dram_a [1:0] pins select one of four block rows. in the width direction of the page, the dram_a [5:2] pins select one of the ten block columns. figure 4.1 illustrates the addressing scheme for block transfer with a block configured as 8w x 4h x 8. the hexadecimal number written on every block of the dram page corresponds to the six address pins dram_a [5:0]. similarly, the number on the pixel buffer block is from the other address bits dram_a [8:6].
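the block access fields described above can be packed and unpacked as in the following python sketch. the function names are hypothetical; the bit positions are the ones stated in the text (dram_a [8:6] pixel buffer block, dram_a [5:2] block column, dram_a [1:0] block row).

```python
# Sketch of the 9-bit dram_a value used by read block / write block operations.

def encode_block_address(pixel_buffer_block, column, row):
    """Pack pixel buffer block (0..7), block column (0..9), block row (0..3)."""
    if not 0 <= pixel_buffer_block < 8:
        raise ValueError("pixel buffer block must be in 0..7")
    if not 0 <= column < 10:
        raise ValueError("a dram page is 10 blocks wide")
    if not 0 <= row < 4:
        raise ValueError("a dram page is 4 blocks high")
    return (pixel_buffer_block << 6) | (column << 2) | row


def decode_block_address(dram_a):
    """Split a 9-bit dram_a value into (pixel_buffer_block, column, row)."""
    return (dram_a >> 6) & 0x7, (dram_a >> 2) & 0xF, dram_a & 0x3
```

the low six bits reproduce the hexadecimal block numbers of figure 4.1: column 9, row 0 gives 0x24, and column 1, row 3 gives 0x07.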
description of dram operations

table 4.1 lists all of the dram operations. one operation can be launched in every cycle. however, the sequence of these dram operations is bounded by the resource interlocks. access page can only be issued after precharge bank, and the only operation after precharge bank is access page. tables 7.6 and 7.7 contain the specific timing interlocks for dram operations in the same bank and between different banks.

table 4.1 dram operation encoding

operation                    dram_op  dram_bs  dram_a
unmasked write block (uwb)   000      bank     pixel buffer block (3 pins), dram block (6 pins)
masked write block (mwb)     001      bank     pixel buffer block (3 pins), dram block (6 pins)
precharge bank (pre)         010      bank     -
video transfer (vdx)         011      bank     control (2 pins), line (4 pins)
duplicate page (dup)         100      bank     page (9 pins)
read block (rdb)             101      bank     pixel buffer block (3 pins), dram block (6 pins)
access page (acp)            110      bank     page (9 pins)
no operation (nop)           111      -        -

figure 4.1 addressing scheme for block transfer on the global bus, for a block size of 8w x 4h x 8 (or 2w x 4h x 32). the blocks in the dram page are numbered with hexadecimal values and selected by dram_a [5:0].

block numbering of a page in a dram bank (80w x 16h), as shown in figure 4.1:

00 04 08 0c 10 14 18 1c 20 24
01 05 09 0d 11 15 19 1d 21 25
02 06 0a 0e 12 16 1a 1e 22 26
03 07 0b 0f 13 17 1b 1f 23 27

dram_a [1:0] selects a block in the height direction of the dram page, dram_a [5:2] selects a block in the width direction, and dram_a [8:6] selects one of the eight blocks (0 to 7) in the pixel buffer. the pixel buffer and the dram page are connected by the 256-bit global bus.
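for software that drives or simulates the dram control port, the encodings of table 4.1 can be captured as constants. the following python sketch is one way to do so; the class name, the member names, and the address mode labels are choices made for this example, while the 3-bit codes come from the table.

```python
from enum import IntEnum


class DramOp(IntEnum):
    """dram_op [2:0] codes from table 4.1."""
    UWB = 0b000  # unmasked write block
    MWB = 0b001  # masked write block
    PRE = 0b010  # precharge bank
    VDX = 0b011  # video transfer
    DUP = 0b100  # duplicate page
    RDB = 0b101  # read block
    ACP = 0b110  # access page
    NOP = 0b111  # no operation


# How dram_a is interpreted for each operation (None: no address field listed).
ADDRESS_MODE = {
    DramOp.UWB: "block",
    DramOp.MWB: "block",
    DramOp.PRE: None,
    DramOp.VDX: "scan line",
    DramOp.DUP: "page",
    DramOp.RDB: "block",
    DramOp.ACP: "page",
    DramOp.NOP: None,
}
```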
unmasked write block (uwb)

the uwb operation copies 32 bytes from the specified pixel buffer block over the global bus to the specified block in the sense amplifiers and the dram page of a selected dram bank. the dram_a [5:0] pins select one of the 40 blocks in a dram page. the dram_a [8:6] pins select one of the eight pixel buffer blocks. the 32-bit plane mask register has no effect on the unmasked write block operation. the 32-bit dirty tag still controls which bytes of the block are updated.

masked write block (mwb)

the mwb operation copies 32 bytes from the specified pixel buffer block over the global bus to the specified block in the sense amplifiers and the dram page of a selected dram bank. the dram_a [5:0] pins select one of the 40 blocks in a dram page. the dram_a [8:6] pins select one of the eight pixel buffer blocks. both the 32-bit dirty tag and the 32-bit plane mask register control which bytes of the block are updated.

figure 4.2 unmasked write block, masked write block, and read block on the global bus (diagram: a pixel buffer block of 8w x 4h x 8, or 32 bytes, is transferred over the 256-bit global bus to or from one block out of 40 in one page out of 257 of the selected bank.)
precharge bank (pre)

the pre operation first deactivates the word line corresponding to the most recently accessed dram page of a selected dram bank and then equalizes the bit lines of the sense amplifiers for a subsequent access page operation. after a precharge bank operation has been performed on a certain dram bank, the operations that can be performed on that dram bank are access page, precharge bank, and nop. other operations after a precharge bank operation are illegal, and the resulting data is undefined.

video transfer (vdx)

there are two parts to the vdx operation: video buffer load and video output. video buffer load relates to the transfer from the sense amplifiers of a selected dram bank to a corresponding video buffer. video output relates to the transfer from a video buffer to the vid_q pins.

video buffer load

there are two video buffers available for interleaved transfer. video buffer i is for bank a and bank c. video buffer ii is for bank b and bank d. figure 4.3 illustrates a video transfer example from a page in bank a to video buffer i.

figure 4.3 video transfer from a bank a page to video buffer i (diagram: four banks of 257 pages each; one page of bank a is transferred to video buffer i, which drives the 16 vid_q pins.)
this paragraph describes the addressing scheme for the video transfer operation in detail. a dram page has a fixed organization of 10 blocks wide by 4 blocks high. for the vdx operation, a 32-byte block is always considered as being 4 rows high (either 8w x 4h x 8 or 2w x 4h x 32). that is, for the vdx operation, a dram page is always viewed as containing 16 rows of 80 bytes each. in the case of 8 bits per pixel, the video transfer operation transfers an 80w x 1h x 8 line of pixel data from the sense amplifiers of a dram page to the corresponding video buffer. the dram_a [3:0] pins are used to select one of the 16 rows in a dram page. since there are 16 vid_q pins, one may think of the video buffer as 40 double-bytes. the dram_a [6:4] pins are ignored in this operation but should still be driven to stable states.

figure 4.4 addressing scheme for video transfer (diagram: a dram page of 80w x 16h, lines 0 to 15, and an 80-byte video buffer.) the dram_a [8:0] fields shown in the figure are:

dram_a [3:0]  selects one line of the page
dram_a [6:4]  ignored
dram_a 7      selects the byte pair mode when dram_a 8 is "1" ("0": normal mode, "1": reversed mode)
dram_a 8      "1": load dram_a 7 into the latch and initialize the video counter; "0": the video output operation is not affected
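a minimal python sketch of the scan line access fields listed for figure 4.4 follows. the function name and its arguments are hypothetical. driving the ignored bits dram_a [6:4] to 0 and leaving dram_a 7 at 0 when the latch is not being loaded are choices made for this example, not requirements stated in the text.

```python
# Sketch of the 9-bit dram_a value for a video transfer (vdx) operation.

def encode_vdx_address(line, load=False, reversed_mode=False):
    """Build dram_a [8:0] for a video transfer.

    line           scan line within the page, 0..15 (dram_a [3:0])
    load           True sets dram_a 8: load the byte pair mode latch and
                   initialize the video counter
    reversed_mode  byte pair mode placed on dram_a 7; only used when load is
                   True, because the latch is loaded only when dram_a 8 is 1
    """
    if not 0 <= line < 16:
        raise ValueError("a dram page has 16 scan lines")
    mode_bit = int(reversed_mode) if load else 0
    # dram_a [6:4] are ignored by the device; they are driven to 0 here
    return (int(load) << 8) | (mode_bit << 7) | line
```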
video output operation

there are two byte order formats for the vid_q video output pins: normal mode and reversed mode. this byte ordering is selected by an internal byte pair mode latch, which is loaded from the dram_a7 pin when the dram_a8 pin is equal to "1". if the latched data is "0", the normal video output mode is applied to the vid_q bus. if the latched data is "1", the reversed video output mode is selected.

since a video buffer holds 640 bits, we may number the bytes in the video buffer from byte 0 through byte 79. in both normal and reversed modes, even bytes always appear on the vid_q [7:0] pins, and odd bytes on the vid_q [15:8] pins. in normal mode, the byte data is shifted out to the vid_q pins in normal sequence, as in [byte 0, byte 1] at video clock 0, [byte 2, byte 3] at video clock 1, and so on up to [byte 78, byte 79] at video clock 39.

however, in many systems, byte 3 contains control bits, while bytes 0, 1, and 2 are the rgb data. therefore, it may be desirable to make byte 3 available on the first video clock to allow the ramdac chip the maximum time to make use of the control bits. the reversed mode is designed for this purpose: the byte data is shifted out in reversed sequence, as in [byte 2, byte 3] at video clock 0, [byte 0, byte 1] at video clock 1, and so on up to [byte 78, byte 79] at video clock 38 and finally [byte 76, byte 77] at video clock 39. in summary, the 16-bit vid_q bus output scheme is illustrated in figure 4.5 for an 80w x 1h x 8 video buffer.

figure 4.5 16-bit vid_q bus output scheme for an 80w x 1h x 8 video buffer (diagram: byte pairs on vid_q [7:0] and vid_q [15:8] for vid_clk 0 to 39, in normal mode and in reversed mode.)
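the two byte orders can be reproduced with a few lines of python. this is only a sketch of the sequence described above; the function name is hypothetical, and the observation that the reversed order amounts to swapping adjacent byte pairs (pair index = clock xor 1) is an interpretation of the listed examples rather than a statement from the datasheet.

```python
# Sketch of the vid_q output order for one 80-byte video buffer.

def vid_q_sequence(reversed_mode=False):
    """Return a list of (video clock, byte on vid_q [7:0], byte on vid_q [15:8])."""
    sequence = []
    for clock in range(40):
        # reversed mode swaps each pair of adjacent byte pairs
        pair = clock ^ 1 if reversed_mode else clock
        sequence.append((clock, 2 * pair, 2 * pair + 1))
    return sequence
```

in both modes the even byte of each pair lands on vid_q [7:0] and the odd byte on vid_q [15:8]; only the order of the pairs changes.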
initialize and abort video output

when the dram_a 8 pin is "1", the byte pair mode latch is loaded, and the current video buffer output operation is aborted. the vid_q bus is driven starting from the video buffer indicated by the dram_bs 0 pin. also, the modulo-40 video counter is initialized. if dram_a 8 is "0", the video counter is not affected. the video output from the current video buffer continues until this buffer is exhausted. then, the video buffer is automatically switched and the video counter is initialized. to avoid data corruption in the video buffer, the user should not start a video transfer operation to the video buffer that is outputting data to the vid_q bus.

figure 8.15 shows an example of initiating a video output process. it begins with commanding a video transfer from the dram control port, while holding the vid_cke signal low to disable the internal vid_clk until vid_qsf changes. when vid_qsf indicates that the specific video buffer is ready, the clock enable vid_cke is asserted to allow video output. for a normal mode video output, figure 8.16 illustrates an example of continuous video output from both video buffers by issuing consecutive video transfer operations on the four dram banks. note that vid_qsf settles from an unknown state to a known state after the initial video transfer with dram_a 8 = 1. except for this initial video transfer, a clean edge transition on vid_qsf is guaranteed for every occurrence of video buffer interleave.

prohibited video operation sequence

performing the vdx operation with dram_a [8:0] = "01xxxxxxx" and reset = "h" for eight consecutive rising edges of mclk causes the device to enter a special manufacturing test mode. this sequence is strictly prohibited, to avoid unexpected device behavior in the system.
figure 4.6 prohibited video operation sequence (timing diagram: mclk, dram_op [2:0] = "011" (vdx), dram_a [8:0] = "0 1xxx xxxx", reset = "h", and test_mode, showing entry into and exit from test mode.)
duplicate page (dup)

all 10,240 bits of the data in the sense amplifiers of a selected dram bank can be transferred to any specified page in the same bank within one duplicate page operation. the data in the sense amplifiers is not affected by this operation. if the dram_a 8 pin is 0, then the dram_a [7:0] pins select one of the 256 normal pages. if dram_a 8 is 1, then the dram_a [7:0] pins are ignored and the extra page is written. the plane mask register does not apply to this operation.

it may be helpful to point out that it is not necessary to use the dup operation to write back the data in the sense amplifiers, because they function as a level-two write-through pixel cache. dup is a special performance function that offers ultra-fast data movement in a frame buffer. consider the task of clearing an entire 1280 x 1024 x 32 frame buffer. using only the mwb operations for this task, the 256-bit global bus and four-bank interleaving plus parallel operations to the four 3d-ram chips offer very good bandwidth. the data rate is 5.8 GB/s for the -10 grade of 3d-ram, and the entire screen is cleared in 860 µs, without considering the interruptions of video refresh. however, with the dup performance function, the data rate increases tenfold to 58.6 GB/s, and the entire screen is cleared in only 85 µs, with the same -10 grade of 3d-ram.

figure 4.7 duplicate page in dram bank a

read block (rdb)

the rdb operation copies 32 bytes from the sense amplifiers of a selected dram bank over the global bus to the specified block in the pixel buffer. the corresponding 32-bit dirty tag is cleared. the dram_a [5:0] pins select one of the 40 blocks in a dram page. the dram_a [8:6] pins select one of the eight pixel buffer blocks. the read block operation is also illustrated in figure 4.2.
access page (acp)

the acp operation activates the word line corresponding to the specified dram page of a selected dram bank and transfers the data in the dram array to the sense amplifiers. if the dram_a8 pin is "0", then the dram_a [7:0] pins select one of the 256 normal pages. if the dram_a8 pin is "1", then the dram_a [7:0] pins are ignored and the extra page is transferred. before an access page operation can be performed on a certain dram bank, a precharge bank operation must have been performed for that dram bank. after an access page operation, several dram read and write operations, such as uwb, mwb, rdb, dup, and vdx, may be performed.
figure 4.8 access page means transferring a specified page to the sense amplifiers.
no operation (nop)

the nop operation may be freely inserted between the acp operation and the pre operation on the same bank. nops are issued when the dram arrays are idle, no read or write is required by the pixel buffer, and no video buffer load is necessary. more importantly, nops are required to satisfy the timing interlocks of the various dram operations, as listed in tables 7.6 and 7.7; for this application, each nop operation simply takes one clock period.
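the ordering rules stated in this chapter for a single bank (access page requires a preceding precharge bank; after precharge bank only access page, precharge bank, and nop are legal) can be checked with a small state machine. the python sketch below is illustrative only: the function name is hypothetical, the bank is assumed to start in the precharged state, and the timing interlocks of tables 7.6 and 7.7 are not modeled.

```python
# Sketch of an ordering check for dram operations issued to one bank.

PAGE_OPS = {"uwb", "mwb", "rdb", "dup", "vdx"}  # need an accessed (open) page


def first_illegal_op(ops, precharged=True):
    """Return the index of the first illegal operation, or None if all are legal.

    ops         sequence of mnemonics: acp, pre, nop, uwb, mwb, rdb, dup, vdx
    precharged  initial bank state (assumption: the bank starts precharged)
    """
    for index, op in enumerate(ops):
        if op == "nop":
            continue
        if op == "pre":
            precharged = True          # legal in either state
        elif op == "acp":
            if not precharged:
                return index           # acp must follow a precharge bank
            precharged = False
        elif op in PAGE_OPS:
            if precharged:
                return index           # no page is held in the sense amplifiers
        else:
            raise ValueError("unknown operation: " + op)
    return None
```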
pixel alu pipelines and dram activities 5

pixel alu pipelines and dram activities

this chapter includes some pipeline examples of the interaction between the global bus and the pixel alu, as well as some typical sequences of dram operations on the same bank. for dram operations, we assume that the clock cycle time equals the minimum requirement of the specification, depending on the speed grade of the parts. if the 3d-ram is not running at the minimum mclk cycle time, then the dram operations shown in the tables of this chapter do not govern the cycles of operations. the interlocks listed in tables 7.6 and 7.7 are always the governing parameters that determine the cycles of dram operations. these interlocks specify time durations only and are independent of the clock period and the number of clock cycles, unless specifically noted otherwise.

dram and pixel alu interactions

the global bus and the pixel alu interact as shown in the following tables. a word on the notation may be in order here. braces are used to enclose more than one specification that can qualify the operation outside the braces. for example, stateful {initial, normal} data write means either stateful initial data write or stateful normal data write may be applied. in fact, the table entries shorten this notation to simply stateful data write.

table 5.1 shows a read block operation immediately followed by a read data operation that uses data from the read block.

table 5.1 read block on global bus to read data on pixel alu

cycle  dram activities                              pixel alu activities
n      read block op specified on dram_en, dram_op  -
n+1    read block on global bus                     -
n+2    read block on global bus                     read data op specified on palu_en, palu_we, palu_op
n+3    -                                            read data op specified on palu_en, palu_we, palu_op; data read from pixel buffer
n+4    -                                            data read from pixel buffer; data on palu_dq
n+5    -                                            data on palu_dq
table 5.2 shows a read block operation immediately followed by a stateful {initial, normal} data write operation that performs a read-modify-write on the data from the read block.

table 5.2 read block on global bus to stateful {initial, normal} data write on pixel alu

cycle  dram activities                              pixel alu activities
n      read block op specified on dram_en, dram_op  -
n+1    read block on global bus                     -
n+2    read block on global bus                     stateful data write op specified on palu_en, palu_we, palu_op
n+3    -                                            old data read from pixel buffer; new data read from palu_dq, palu_dx
n+4    -                                            rop/blend 1
n+5    -                                            rop/blend 2
n+6    -                                            rop/blend 3
n+7    -                                            rop/blend 4
n+8    -                                            result data written to pixel buffer
table 5.3 shows a {stateless, stateful} {initial, normal} data write operation immediately followed by a {masked, unmasked} write block operation.

table 5.3 {stateless, stateful} {initial, normal} data write on pixel alu to {masked, unmasked} write block on global bus

cycle  dram activities                                                    pixel alu activities
n      -                                                                  data write op specified on palu_en, palu_we, palu_op
n+1    -                                                                  old data read from pixel buffer; new data on palu_dq, palu_dx
n+2    -                                                                  rop/blend 1
n+3    -                                                                  rop/blend 2
n+4    -                                                                  rop/blend 3
n+5    -                                                                  rop/blend 4
n+6    {masked, unmasked} write block op specified on dram_en, dram_op    result data written to pixel buffer
n+7    {masked, unmasked} write block on global bus                       -
n+8    {masked, unmasked} write block on global bus                       -
table 5.4 shows a {replace, or} dirty tag operation immediately followed by a {masked, unmasked} write block operation.

table 5.4 {replace, or} dirty tag on pixel alu to {masked, unmasked} write block on global bus

cycle  dram activities                                                    pixel alu activities
n      -                                                                  dirty tag op specified on palu_en, palu_we, palu_op
n+1    -                                                                  dirty tag data on palu_dq
n+2    -                                                                  pass through rop/blend 1
n+3    -                                                                  pass through rop/blend 2
n+4    -                                                                  pass through rop/blend 3
n+5    -                                                                  pass through rop/blend 4
n+6    {masked, unmasked} write block op specified on dram_en, dram_op    data written to dirty tag
n+7    {masked, unmasked} write block on global bus                       -
n+8    {masked, unmasked} write block on global bus                       -
table 5.5 shows a write register operation to the plane mask register followed by the latest masked write block operation that can use the previous contents of the plane mask register.

table 5.5 write register (plane mask) on pixel alu to masked write block on global bus

cycle  dram activities                                                 pixel alu activities
n      -                                                               write register op specified on palu_en, palu_we, palu_op
n+1    -                                                               plane mask data on palu_dq
n+2    -                                                               -
n+3    masked write block op specified on dram_en, dram_op             -
n+4    masked write block on global bus (uses old plane mask value)    -
n+5    masked write block on global bus (uses old plane mask value)    -
n+6    -                                                               plane mask register loaded
n+7    -                                                               -
n+8    -                                                               -
n+9    -                                                               -
table 5.6 shows a write register operation to the plane mask register followed by the earliest masked write block operation that can use the new plane mask.

table 5.6 write register (plane mask) on pixel alu to masked write block on global bus

cycle  dram activities                                                 pixel alu activities
n      -                                                               write register op specified on palu_en, palu_we, palu_op
n+1    -                                                               plane mask data on palu_dq
n+2    -                                                               -
n+3    -                                                               -
n+4    -                                                               -
n+5    -                                                               -
n+6    masked write block op specified on dram_en, dram_op             plane mask register loaded
n+7    masked write block on global bus (uses new plane mask value)    -
n+8    masked write block on global bus (uses new plane mask value)    -
n+9    -                                                               -
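tables 5.5 and 5.6 together bound when a masked write block sees a newly written plane mask. the python sketch below encodes just those two data points; the function name and the returned labels are hypothetical, and the behavior for a masked write block specified in cycles n+4 or n+5 is reported as unspecified because neither table covers it.

```python
# Sketch: which plane mask value does a masked write block use?

def plane_mask_used(write_register_cycle, mwb_cycle):
    """Classify a masked write block relative to a plane mask register write.

    Both arguments are the cycles in which the operations are specified on
    their respective control pins.
    """
    delta = mwb_cycle - write_register_cycle
    if delta <= 3:
        return "old"          # table 5.5: n+3 is the latest mwb using the old mask
    if delta >= 6:
        return "new"          # table 5.6: n+6 is the earliest mwb using the new mask
    return "unspecified"      # n+4 and n+5 are not covered by either table
```

a rendering controller that needs a deterministic result would therefore keep masked write blocks out of the two-cycle window in between.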
dram activities

this section discusses consecutive dram operations on the same bank. to illustrate interlock timing for dram activities, we assume that the clock cycle time equals the minimum specification requirement (10 ns or 13 ns), depending on the speed grade of the parts. the interlock timing restrictions are listed in table 7.6 for the dram operations on the same bank and in table 7.7 for the dram operations on different banks.

table 5.7 shows a minimal length video refresh sequence.

table 5.7 video refresh sequence

cycle  external activities        internal activities
n      access page specified      -
n+1    -                          access page
n+2    -                          access page
n+3    -                          access page
n+4    video transfer specified   access page
n+5    -                          video transfer
n+6    -                          video transfer
n+7    -                          video transfer
n+8    precharge bank specified   video transfer
n+9    -                          precharge bank
n+10   -                          precharge bank
n+11   -                          precharge bank
n+12   -                          precharge bank
table 5.8 shows a minimal length dram refresh sequence.

table 5.8 dram refresh sequence

cycle  external activities        internal activities
n      access page specified      -
n+1    -                          access page
n+2    -                          access page
n+3    -                          access page
n+4    -                          access page
n+5    -                          -
n+6    -                          -
n+7    -                          -
n+8    precharge bank specified   -
n+9    -                          precharge bank
n+10   -                          precharge bank
n+11   -                          precharge bank
n+12   -                          precharge bank
table 5.9 shows a sequence of read block and {masked, unmasked} write block operations.

table 5.9 sequence of read block and {masked, unmasked} write block operations

cycle  external activities         internal activities
n      access page specified       -
n+1    -                           access page
n+2    -                           access page
n+3    -                           access page
n+4    1st read block specified    access page
n+5    -                           1st read block
n+6    2nd read block specified    1st read block
n+7    -                           2nd read block
n+8    1st write block specified   2nd read block
n+9    -                           1st write block
n+10   2nd write block specified   1st write block
n+11   -                           2nd write block
n+12   precharge bank specified    2nd write block
n+13   -                           precharge bank
n+14   -                           precharge bank
n+15   -                           precharge bank
n+16   -                           precharge bank
table 5.10 shows a duplicate page sequence.

table 5.10 duplicate page sequence

cycle  external activities        internal activities
n      access page specified      -
n+1    -                          access page
n+2    -                          access page
n+3    -                          access page
n+4    duplicate page specified   access page
n+5    -                          duplicate page
n+6    -                          duplicate page
n+7    -                          duplicate page
n+8    -                          duplicate page
n+9    -                          duplicate page
n+10   -                          duplicate page
n+11   -                          duplicate page
n+12   precharge bank specified   duplicate page
n+13   -                          precharge bank
n+14   -                          precharge bank
n+15   -                          precharge bank
n+16   -                          precharge bank
frame buffer organizations 6

frame buffer organizations

introduction

there are many ways to use the 3d-ram to implement frame buffers of various resolutions and depths. this section describes the following frame buffer organizations:

• 1280 x 1024 x 8 organization in single chip
• 1280 x 1024 x 32, organized as four 1280 x 1024 x 8 or 320 x 1024 x 32
• 1280 x 1024 x 32 double buffered organization with 32-bit z
• 640 x 512 x 8 double buffered organization with 16-bit z in single chip

1280 x 1024 x 8 organization

in this organization, the screen display is made up of an 8w x 32h array of page groups (that is, 8 page groups wide by 32 page groups high). a page group is 160 pixels wide by 32 pixels high and consists of the same page from all four dram banks (a, b, c, d). the four independent dram banks can be interleaved to allow pages to be prefetched as images are drawn. each page within a page group is 80 pixels wide by 16 pixels high. pages are either sliced into sixteen 80-pixel wide scan lines when sending data to the video buffer, or diced into a 10w x 4h array of 256-bit blocks when dealing with the global bus. two pixels are shifted out of the video buffer every video clock. blocks are 8 pixels wide by 4 pixels high and can be transferred to and from one of the pixel buffer blocks via the global bus. the pixel alu and data pins access four pixels of a pixel buffer block at a time. the dirty tag for an entire pixel buffer block can be written in a single cycle from the data pins.

the following formulas determine which bank, page, etc. a given pixel is in, given the x and y coordinates of the pixel. the formulas use c syntax, where the percent sign ("%") indicates the integer modulus operation and the slash sign ("/") indicates integer division. these formulas are valid only when 0 ≤ x < 1280 and 0 ≤ y < 1024.

• bank = 2*((y%32)/16) + (x%160)/80, [0 = bank a, 1 = bank b, 2 = bank c, 3 = bank d]
• page = 8*(y/32) + x/160
• scan line within page = y%16
• block within page = (y%16)/4 + 4*((x%80)/8)
• word within block = 2*(y%4) + (x%8)/4
• pixel (byte) within word = x%4

the mapping of page groups to the display screen is completely user definable. the following mappings are hardwired inside the 3d-ram: blocks to pages, scan lines to pages, words to pixel buffer blocks, and dirty tags to pixel buffer blocks.

1280 x 1024 x 32 single buffered organization

a frame buffer of this size requires four 3d-rams; however, there are two recommended ways of organizing the 3d-rams, which trade off 2d color expansion rendering performance against pixel oriented rendering performance.

• each of the four components of a pixel (r, g, b, a) is in a separate 3d-ram. thus, each 3d-ram supports 1280 x 1024 x 8. this section describes this implementation.
• all four components of a pixel reside in the same 3d-ram. the four 3d-rams are interleaved on a pixel by pixel basis in a scan line. thus, each 3d-ram supports 320 x 1024 x 32. page 97 describes this implementation.

the 320 x 1024 x 32 mode is nearly the same as 1280 x 1024 x 8 except that the pixels are four times as deep and the screen, page groups, pages, and blocks are one fourth as wide. one pixel is shifted out of the video buffer every two video clocks. the pixel alu and palu_dq pins access one pixel of a pixel buffer block. the dirty tag for an entire pixel buffer block can be written in a single cycle from the palu_dq pins.
the dirty tag controls the four bytes of a 32-bit pixel independently.

the following formulas determine which bank, page, etc. a given pixel is in, given the x and y coordinates of the pixel.

• bank = 2*((y%32)/16) + (x%40)/20, [0 = bank a, 1 = bank b, 2 = bank c, 3 = bank d]
• page = 8*(y/32) + x/40
• scan line within page = y%16
• block within page = (y%16)/4 + 4*((x%20)/2)
• pixel (word) within block = 2*(y%4) + (x%2)

figure 6.1 this diagram shows how 3d-ram maps to pixels in a single-chip 1280x1024x8 frame buffer. the numbers outside each rectangle show its dimensions in pixels.

notes from figure 6.1:

• the screen (1280 x 1024) is 8 page groups wide by 32 page groups high; pages are numbered 0 to 255.
• a page group (160 x 32) consists of the same page from all four dram banks (0 = a, 1 = b, 2 = c, 3 = d).
• each page (80 x 16) can be divided into either 40 blocks (numbered 0 to 39, selected by dram_a [5:2] and dram_a [1:0]), which can be accessed from the global bus, or 16 scan lines (selected by dram_a [3:0]), which can be accessed by the video buffer.
• blocks (8 x 4) can be accessed in 4-pixel words by the pixel alu. the word within the block is selected by palu_a [2:1] and palu_a 0. the four pixels of a word correspond to palu_be0 and palu_dq [7:0], palu_be1 and palu_dq [15:8], palu_be2 and palu_dq [23:16], and palu_be3 and palu_dq [31:24].
• video buffer data (80 x 1, bytes 0 to 79) is shifted out two bytes at a time on each vid_clk.
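the pixel location formulas given in this chapter for the 1280 x 1024 x 8 and 320 x 1024 x 32 organizations translate directly into code. the python sketch below uses floor division (//) where the formulas use c integer division, which is equivalent for the non-negative coordinates involved. the function names and dictionary keys are hypothetical, and the valid range 0 ≤ x < 320 for the 32-bit case is an assumption based on the stated 320 x 1024 size.

```python
# Sketch of the pixel-to-memory mapping formulas of chapter 6.

def locate_pixel_x8(x, y):
    """Locate a pixel in the single-chip 1280 x 1024 x 8 organization."""
    if not (0 <= x < 1280 and 0 <= y < 1024):
        raise ValueError("formulas are valid only for 0 <= x < 1280, 0 <= y < 1024")
    return {
        "bank": 2 * ((y % 32) // 16) + (x % 160) // 80,   # 0 = a, 1 = b, 2 = c, 3 = d
        "page": 8 * (y // 32) + x // 160,
        "scan_line": y % 16,
        "block": (y % 16) // 4 + 4 * ((x % 80) // 8),
        "word": 2 * (y % 4) + (x % 8) // 4,
        "byte": x % 4,
    }


def locate_pixel_x32(x, y):
    """Locate a pixel in the 320 x 1024 x 32 organization of one chip."""
    if not (0 <= x < 320 and 0 <= y < 1024):   # assumed valid range
        raise ValueError("expected 0 <= x < 320, 0 <= y < 1024")
    return {
        "bank": 2 * ((y % 32) // 16) + (x % 40) // 20,
        "page": 8 * (y // 32) + x // 40,
        "scan_line": y % 16,
        "block": (y % 16) // 4 + 4 * ((x % 20) // 2),
        "word": 2 * (y % 4) + (x % 2),
    }
```

in both organizations the last pixel of the screen falls in bank d, page 255, scan line 15, block 39, word 7, which matches the 8 x 32 array of page groups and the 40 blocks per page.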
figure 6.2 1280 x 1024 x 32 single buffer 3d-ram system. blocks shown: system interface, rendering controller, four 3d-rams, ramdac, monitor. labeled signals: pixel data (32 bits per 3d-ram), address & control, video data (16 bits per 3d-ram), video control.
figure 6.3 this diagram shows how 3d-ram maps to pixels in a single-chip 320x1024x32 frame buffer. the numbers outside each rectangle show its dimensions in pixels. notes in the figure:
• the screen is 8 page groups wide by 32 page groups high.
• a page group consists of the same page from all four dram banks (a, b, c, d).
• each page can be divided into either 40 blocks, which can be accessed from the global bus, or 16 scan lines, which can be accessed by the video buffer.
• blocks can be accessed in 1-pixel words by the pixel alu.
• video buffer data is shifted out two bytes at a time on each vid_clk. a pixel is shifted out every two cycles.
pin labels in the figure: dram_a [5:2], dram_a [3:0], dram_a [1:0], palu_a [2:1], palu_a 0, palu_dq [31:0].
1280 x 1024 x 32 double buffered organization with z

the basic configuration for a 1280 x 1024 x 32 double buffered organization with z buffer is shown in figure 6.4. this configuration uses only twelve 3d-rams. in this example each 3d-ram (for buffers a, b, and z) covers a 320 x 1024 portion of the 1280 x 1024 displayed image. the interleave is in the x direction. this implies that vertical scrolling can take place at a very high speed, because all data movement occurs within the 3d-ram chips rather than across chips. horizontal scrolling would require 3d-ram to 3d-ram data transfers.

each of buffers a, b, and z is 32 bits in pixel depth. for buffers a and b this allows 8 bits each for r, g, and b, and 8 bits for alpha or overlays; we refer to the eight 3d-rams containing these data as the color buffer 3d-rams. in the z buffer, 24 bits can be used for depth and 8 bits for a combination of stencil pattern id and window id; we refer to these four 3d-rams as the z buffer 3d-rams.

the rendering controller is shown with a 256-bit interface for maximum performance. the three 3d-rams (one from each of buffers a, b, and z) that hold the data for the same pixels share a 64-bit bus. more specifically, the two 3d-ram chips in buffers a and b share the same 32-bit data bus, because only one of them is active for rendering while the other outputs display data through the video port, and the 3d-ram chip in the z buffer requires its own 32-bit data bus. a 64-bit or 128-bit bus between the rendering controller and the 3d-rams could be used, but with some loss of performance due to more restricted bandwidth and higher bus loading (implying a lower maximum clock frequency).

the z buffer 3d-rams use their compare units to check depth, stencil, and window id, and supply the result to the pass_out pins.
the pass_out pins of z buffer are connected to the pass_in pins of the corresponding color buffer 3d-rams. the results of all alu operations are conditionally written to the pixel buffer, depending on the states of the pass_in pins (and on the states of the pass_out pins of the color buffer 3d-rams themselves if they also perform compare tests). both buffers a and b are connected to the ramdac chip using a 128-bit bus. buffers a and b can be selected on a pixel-by-pixel basis, alternating between the two buffers.
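the twelve-chip count for this organization follows from simple capacity arithmetic. the sketch below assumes that one 3d-ram holds 1280 x 1024 x 8 bits, as stated for the single buffered organization; the function name is illustrative.

```python
# Capacity implied by "each 3d-ram supports 1280 x 1024 x 8".
CHIP_BITS = 1280 * 1024 * 8


def chips_needed(width, height, bits_per_buffer):
    """Return the number of 3d-rams needed to hold all listed buffers.

    bits_per_buffer is a list of pixel depths, one entry per buffer.
    Uses integer ceiling division to avoid floating point rounding.
    """
    total_bits = width * height * sum(bits_per_buffer)
    return -(-total_bits // CHIP_BITS)


# Buffers a, b, and z, each 32 bits deep.
print(chips_needed(1280, 1024, [32, 32, 32]))
```

the same function gives four chips for a single 32-bit buffer at 1280 x 1024.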
figure 6.4 1280 x 1024 x 32 double buffered organization with 32-bit z buffer. blocks shown: system interface, rendering controller, four buffer a 3d-rams, four buffer b 3d-rams, four z buffer 3d-rams, ramdac, monitor. labeled signals: four 64-bit buses from the rendering controller, each split into 32-bit buses, address & control, pass_out to pass_in connections, 16-bit video data per color buffer 3d-ram, video control.
640 x 512 x 8 double buffered organization with z

a single 3d-ram chip can be configured to support a 640 x 512 x 8 double buffered organization with 16-bit z. this configuration might be suitable for a very high performance, low cost consumer home or arcade game application. the basic allocation of memory can be seen in figure 6.6. one fourth of the 3d-ram serves as buffer a, one fourth as buffer b, and the rest as the 16-bit z buffer. all z compares and rop/blend functions are done on the same 3d-ram. a 32-bit pixel and z data bus is provided to the rendering controller. a 16-bit bus interfaces to the ramdac.

figure 6.5 using a single 3d-ram to configure double buffered 640 x 512 x 8 with 16-bit z. the diagram shows one 3d-ram holding buffer a (640 x 512 x 8), buffer b (640 x 512 x 8), and the z buffer (640 x 512 x 16), with a 32-bit bus to the rendering controller and a 16-bit bus to the ramdac.
figure 6.6 this diagram shows how 3d-ram maps to pixels in a single-chip 640x512x8 frame buffer. the numbers outside each rectangle show its dimensions in pixels. notes in the figure:
• the screen is 16 page groups wide by 16 page groups high.
• a page group consists of the same page from all four dram banks (a, b, c, d).
• each page can be divided into either 40 blocks, which can be accessed from the global bus, or 16 scan lines, which can be accessed by the video buffer.
• blocks can be accessed in 1-pixel words by the pixel alu.
• palu_dq [31:24] carries buffer a, palu_dq [23:16] carries buffer b, and palu_dq [15:0] carries the z buffer.
• video buffer data is shifted out two bytes at a time on each vid_clk. a pixel is shifted out every two cycles. z buffer data is ignored. buffer a or b is selected by the ramdac chip or external logic.
pin labels in the figure: dram_a [5:2], dram_a [3:0], dram_a [1:0], palu_a [2:1], palu_a 0.
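the byte-lane assignment shown in figure 6.6 (buffer a on palu_dq [31:24], buffer b on palu_dq [23:16], z on palu_dq [15:0]) can be sketched as a pair of pack and unpack helpers. the helper names are illustrative assumptions; only the bit positions come from the figure.

```python
def pack_word(buffer_a, buffer_b, z):
    """Pack one pixel's buffer a, buffer b, and z values into 32 bits."""
    if not 0 <= buffer_a <= 0xFF:
        raise ValueError("buffer a value must fit in 8 bits")
    if not 0 <= buffer_b <= 0xFF:
        raise ValueError("buffer b value must fit in 8 bits")
    if not 0 <= z <= 0xFFFF:
        raise ValueError("z value must fit in 16 bits")
    return (buffer_a << 24) | (buffer_b << 16) | z


def unpack_word(word):
    """Split a 32-bit word into (buffer a, buffer b, z)."""
    return (word >> 24) & 0xFF, (word >> 16) & 0xFF, word & 0xFFFF


print(hex(pack_word(0x12, 0x34, 0xABCD)))
```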
electrical specifications 7

electrical specifications

absolute maximum ratings

table 7.1 absolute maximum ratings

symbol   parameter                      conditions            ratings       unit
vdd      supply voltage                 with respect to vss   -0.5 to 4.6   v
vi       input voltage                                        -0.5 to 4.6   v
vo       output voltage                                       -0.5 to 4.6   v
io       output current                                       50            ma
tj       maximum junction temperature                         125           °c
topr     operation temperature                                0 to 70       °c
tstg     storage temperature                                  -65 to 150    °c

testing conditions

the supply voltage vdd and ambient temperature ta for testing are as follows: vdd = 3.3 v ± 5%, ta = 0 °c to 70 °c.

figure 7.1 shows the output test load for the palu_dq, pass_out, vid_q, and vid_qsf pins. the capacitive loading cl is 60 pf for the palu_dq pins, 30 pf for the pass_out pin, and 20 pf for both the vid_q pins and the vid_qsf pin. figure 7.2 is the output test load for the open-drain hit pin, with rpu = 330 Ω for pull-up and cl = 75 pf.

figure 7.1 output test load for the palu_dq, pass_out, vid_q, and vid_qsf pins (c l = 60 pf for palu_dq, 30 pf for pass_out, 20 pf for vid_q and vid_qsf)
figure 7.2 output test load for the hit pin (as drawn: c l = 50 pf, r pu = 330 Ω to +3.3 v)

the ac timing measurements are summarized in figure 7.3 through figure 7.6. the clock waveform measurements are shown in figure 7.3. the input and output timing measurements are shown in figure 7.4 and figure 7.5, respectively. figure 7.6 shows the asynchronous output enable timing measurements.

figure 7.3 clock waveform measurement (levels shown: 2.0 v, 1.5 v, 0.8 v)
t 1: clock cycle time (minimum)
t 2: clock high pulse width (minimum)
t 3: clock low pulse width (minimum)
figure 7.4 input timing measurement (signals shown: clock, reset, input; levels shown: 2.0 v, 1.5 v, 0.8 v)
t 6: reset setup time (minimum)
t 7: reset pulse width (minimum)
t 8: input setup time (minimum)
t 9: input hold time (minimum)

figure 7.5 output timing measurement (signals shown: clock at 1.5 v, output at 2.4 v and 0.4 v)
t 10: clock to output low impedance, i o > 2 * i oz (minimum)
t 11: output access time from clock (maximum)
t 12: output valid time after clock (minimum)
t 13: clock to output high impedance, i o < i oz (maximum)
figure 7.6 asynchronous output enable timing measurement (signals shown: oe, active high, at 2.0 v and 0.8 v; output at 2.4 v and 0.4 v)
t 14: valid output after oe low (minimum)
t 15: output high impedance, i o < i oz, after oe low (maximum)
t 16: output low impedance, i o > 2 i oz, after oe high (minimum)
t 17: valid output after oe high (maximum)

figure 7.7 scan_tdo timing measurement (signals shown: clock at 1.5 v, output at 2.4 v and 0.4 v)
t 18: clock to output low impedance, i o > 2 i oz (minimum)
t 19: output access time from clock (maximum)
t 20: output valid time after clock (minimum)
t 21: clock to output high impedance, i o < i oz (maximum)
dc specifications

table 7.2 lists the dc characteristics and the operation conditions. table 7.3 lists the average supply current according to the operations, which include the pixel alu, dram and video operations.

table 7.2 dc characteristics (v dd = 3.3 v ± 5%, t a = 0 °c to 70 °c)

symbol                 parameter                              min    max          unit
v ih (a)               input high voltage                     2.0    v dd + 0.3   v
v il (b)               input low voltage                      -0.3   0.8          v
v ih (pass_in [1:0])   pass_in [1:0] high voltage             1.5    v dd + 0.3   v
v il (pass_in [1:0])   pass_in [1:0] low voltage              -0.3   0.9          v
v oh (c)               output high voltage, i l = -0.2 ma     2.4                 v
v ol (d)               output low voltage, i l = 0.2 ma       0      0.4          v
v oh (pass_out)        pass_out high voltage, i l = -0.1 ma   1.9                 v
v ol (pass_out)        pass_out low voltage, i l = 0.1 ma            0.5          v
v oh (hit)             hit high voltage                                           v
v ol (hit)             hit low voltage                               0.8          v
i oz                   output leakage current in tri-state    -10    10           µa
i il                   input leakage current                  -10    10           µa
c in                   input capacitance                             5            pf
c clk                  clk input capacitance                         7            pf
c i/o                  i/o capacitance                               7            pf

a, b. this parameter applies to every input pin except pass_in [1:0].
c, d. this parameter applies to all output pins except pass_out and hit.
table 7.3 average supply current by function

symbol   parameter                                             conditions                                -10a, -10   -12   unit
i cc     average supply current for alu operation              t clk = min (1 mclk cycle)                260         215   ma
i cc     average standby current                               t clk =                                   20          20    ma
i cc     average supply current for dram operation acp         t clk = min (4 mclk cycles)               105         85    ma
i cc     average supply current for dram operation pre         t clk = min (4 mclk cycles)               55          40    ma
i cc     average supply current for dram operation dup         t clk = min (8 mclk cycles)               55          40    ma
i cc     average supply current for dram operation rdb         t clk = min (2 or 3 mclk cycles)          160         130   ma
i cc     average supply current for dram operation uwb         t clk = min (2 or 3 mclk cycles)          160         130   ma
i cc     average supply current for dram operation vdx         t clk = min (4 mclk cycles)               80          60    ma
i cc     average supply current for video output               t vclk = min                              70          70    ma

ac specifications

every ac timing parameter is illustrated in at least one of the timing figures in chapter 8. the "refer" column in each timing table refers to the exact measurement levels illustrated in figures 7.3 through 7.6.

pixel alu timing parameters

the timing parameters of M5M410092B-10 and -12 are presented in table 7.4.

table 7.4 pixel alu timing parameters

symbol   parameter                           limit   -10a    -10     -12     unit   refer   ch. 8 figure
t clk    master clock mclk cycle time        min     10      10 *    12      ns     t 1     4
                                             max     16000   16000   16000   ns
t clkh   mclk high pulse width               min     4       4       5       ns     t 2     4
t clkl   mclk low pulse width                min     4       4       5       ns     t 3     4
t rss    reset setup time                    min     0       0       0       ns     t 6     1
t rsp    reset pulse width                   min     40      40      48      ns     t 7     2
t ens    palu_en setup time                  min     3       3       4       ns     t 8     4
t enh    palu_en hold time                   min     1.5     1.5     1.5     ns     t 9     4
t ops    palu_op setup time                  min     3       3       4       ns     t 8     4
t oph    palu_op hold time                   min     1.5     1.5     1.5     ns     t 9     4
t ads    palu_a setup time                   min     3       3       4       ns     t 8     4
t adh    palu_a hold time                    min     1.5     1.5     1.5     ns     t 9     4
t dqs    palu_dq, palu_dx setup time         min     3       3       4       ns     t 8     5
t dqh    palu_dq, palu_dx hold time          min     1.5     1.5     1.5     ns     t 9     5
t wes    palu_we setup time                  min     3       3       4       ns     t 8     4
t weh    palu_we hold time                   min     1.5     1.5     1.5     ns     t 9     4
t bes    palu_be setup time                  min     3       3       4       ns     t 8     4
t beh    palu_be hold time                   min     1.5     1.5     1.5     ns     t 9     4
t clz    mclk to palu_dq low impedance       min     4       4       5       ns     t 10    4
t cq     palu_dq access time                 max     14      14      18      ns     t 11    4
t cvd    palu_dq data valid time             min     4       4       4       ns     t 12    4
t chz    mclk to palu_dq high impedance      max     ?       ?       4       ns     t 13    4
t pss    pass_in setup time                  min     2       2       3       ns     t 8     5
t psh    pass_in hold time                   min     0       0       0       ns     t 9     5
t cps    mclk to valid pass_out              max     6       6       8       ns     t 11    5
t cpsv   pass_out data valid time            min     3       3       3       ns     t 12    5
t cht    mclk to valid hit                   max     35      35      35      ns     t 11    6

* t clk = 10.0 ns except that for the alpha saturate logic t clk = 12.0 ns.
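two quantities a designer commonly derives from table 7.4 are the maximum mclk frequency and the number of clock cycles that reset must be held. the sketch below uses only the minimum t clk and t rsp values from the table; the dictionary and function names are illustrative.

```python
import math

# Minimum values in ns, by speed grade, taken from table 7.4.
# Note: on the -10 grade the alpha saturate logic needs t clk = 12.0 ns.
T_CLK_MIN_NS = {"-10a": 10.0, "-10": 10.0, "-12": 12.0}
T_RSP_MIN_NS = {"-10a": 40.0, "-10": 40.0, "-12": 48.0}


def max_mclk_mhz(grade):
    """Highest mclk frequency allowed by the minimum cycle time."""
    return 1000.0 / T_CLK_MIN_NS[grade]


def reset_pulse_cycles(grade, t_clk_ns=None):
    """Whole mclk cycles needed to cover the minimum reset pulse width."""
    if t_clk_ns is None:
        t_clk_ns = T_CLK_MIN_NS[grade]
    return math.ceil(T_RSP_MIN_NS[grade] / t_clk_ns)


for grade in T_CLK_MIN_NS:
    print(grade, round(max_mclk_mhz(grade), 1), "mhz,",
          reset_pulse_cycles(grade), "reset cycles")
```

at the minimum cycle time every grade needs four mclk cycles of reset; a slower clock needs fewer.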
dram timing parameters

the measurements of the dram interlock timings in tables 7.6 and 7.7 are from the mclk rising edge of the first operation to the mclk rising edge of the second operation. both mclk edges are measured at 1.5 v.

table 7.5 minimum requirements of the dram timing parameters

symbol   parameter                    -10a, -10   -12   unit   refer   ch. 8 figure
t ref    refresh interval for array   17          17    ms
t dens   dram_en setup time           3           4     ns     t 8     7
t denh   dram_en hold time            1.5         1.5   ns     t 9     7
t dops   dram_op setup time           3           4     ns     t 8     7
t doph   dram_op hold time            1.5         1.5   ns     t 9     7
t dbks   dram_bs setup time           3           4     ns     t 8     7
t dbkh   dram_bs hold time            1.5         1.5   ns     t 9     7
t dads   dram_a setup time            3           4     ns     t 8     7
t dadh   dram_a hold time             1.5         1.5   ns     t 9     7
table 7.6 minimum requirements of the dram interlock timings for operations on same bank

symbol       parameter                          -10a, -10   -12   unit   ch. 8 timing figure
t dabs       access page to block transfer      36          36    ns     7
t daps (a)   access page to precharge bank      60          72    ns     7
t dads       access page to duplicate page      48          48    ns     8
t davs       access page to video transfer      40          48    ns     9
t dbbs       block transfer to block transfer   20          24    ns     7
t dbps       block transfer to precharge bank   20          24    ns     7
t dbds       block transfer to duplicate page   20          24    ns     8
t dbvs       block transfer to video transfer   20          24    ns     10
t dpas (b)   precharge bank to access page      40          48    ns     7
t dpps       precharge bank to precharge bank   10          12    ns     10
t ddbs       duplicate page to block transfer   80          96    ns     8
t ddps       duplicate page to precharge bank   80          96    ns     9
t ddds       duplicate page to duplicate page   80          96    ns     8
t ddvs       duplicate page to video transfer   80          96    ns     9
t dvbs       video transfer to block transfer   40          48    ns     10
t dvps       video transfer to precharge bank   20          24    ns     9
t dvds       video transfer to duplicate page   40          48    ns     9
t dvvs       video transfer to video transfer   80          96    ns     9

a. the maximum timing limit from acp to pre is 100,000 ns.
b. the operation from pre to acp requires at least two clock cycles. at the first clock rising edge, pre starts. at the second clock rising edge, the preparation for acp starts.
minimum requirements of the dram interlock timings for operations between two different banks

symbol   parameter                          -10a, -10   -12   unit   ch. 8 timing figure
t daad   access page to access page         40          48    ns     11
t dabd   access page to block transfer      10          12    ns     11
t dapd   access page to precharge bank      40          48    ns     11
t dadd   access page to duplicate page      40          48    ns     12
t davd   access page to video transfer      40          48    ns     13
t dbad   block transfer to access page      10          12    ns     11
t dbbd   block transfer to block transfer   20          24    ns     11
t dbpd   block transfer to precharge bank   10          12    ns     11
t dbdd   block transfer to duplicate page   10          12    ns     11
t dbvd   block transfer to video transfer   10          12    ns     13
t dpad   precharge bank to access page      10          12    ns     11
t dpbd   precharge bank to block transfer   10          12    ns     11
t dppd   precharge bank to precharge bank   10          12    ns     11
t dpdd   precharge bank to duplicate page   10          12    ns     11
t dpvd   precharge bank to video transfer   10          12    ns     13
t ddad   duplicate page to access page      80          96    ns     12
t ddbd   duplicate page to block transfer   10          12    ns     12
t ddpd   duplicate page to precharge bank   40          48    ns     12
t dddd   duplicate page to duplicate page   80          96    ns     12
t ddvd   duplicate page to video transfer   80          96    ns     13
t dvad   video transfer to access page      40          48    ns     13
t dvbd   video transfer to block transfer   10          12    ns     13
t dvpd   video transfer to precharge bank   20          24    ns     13
t dvdd   video transfer to duplicate page   40          48    ns     13
t dvvd   video transfer to video transfer   80          96    ns     13
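the two interlock tables lend themselves to a mechanical check of a planned dram command sequence. the sketch below copies the -10a/-10 minimums into lookup tables and flags any pair of operations issued too close together. it is a conservative illustration under stated assumptions: it compares every earlier operation with every later one, "bkx" is used as shorthand for any block transfer (such as rdb or uwb), and the function name and tuple layout are hypothetical.

```python
# Minimum spacing in ns for the -10a/-10 grade, (first op, second op) -> ns.
SAME_BANK = {
    ("acp", "bkx"): 36, ("acp", "pre"): 60, ("acp", "dup"): 48, ("acp", "vdx"): 40,
    ("bkx", "bkx"): 20, ("bkx", "pre"): 20, ("bkx", "dup"): 20, ("bkx", "vdx"): 20,
    ("pre", "acp"): 40, ("pre", "pre"): 10,
    ("dup", "bkx"): 80, ("dup", "pre"): 80, ("dup", "dup"): 80, ("dup", "vdx"): 80,
    ("vdx", "bkx"): 40, ("vdx", "pre"): 20, ("vdx", "dup"): 40, ("vdx", "vdx"): 80,
}
DIFF_BANK = {
    ("acp", "acp"): 40, ("acp", "bkx"): 10, ("acp", "pre"): 40, ("acp", "dup"): 40, ("acp", "vdx"): 40,
    ("bkx", "acp"): 10, ("bkx", "bkx"): 20, ("bkx", "pre"): 10, ("bkx", "dup"): 10, ("bkx", "vdx"): 10,
    ("pre", "acp"): 10, ("pre", "bkx"): 10, ("pre", "pre"): 10, ("pre", "dup"): 10, ("pre", "vdx"): 10,
    ("dup", "acp"): 80, ("dup", "bkx"): 10, ("dup", "pre"): 40, ("dup", "dup"): 80, ("dup", "vdx"): 80,
    ("vdx", "acp"): 40, ("vdx", "bkx"): 10, ("vdx", "pre"): 20, ("vdx", "dup"): 40, ("vdx", "vdx"): 80,
}


def check_sequence(ops):
    """Return a list of interlock violations.

    ops is a list of (time_ns, op, bank) tuples sorted by time, where
    time_ns is the mclk rising edge on which the operation is issued.
    Pairs that have no entry in the tables are not checked.
    """
    violations = []
    for i, (t1, op1, bank1) in enumerate(ops):
        for t2, op2, bank2 in ops[i + 1:]:
            table = SAME_BANK if bank1 == bank2 else DIFF_BANK
            need = table.get((op1, op2))
            if need is not None and t2 - t1 < need:
                violations.append(
                    f"{op1}({bank1})@{t1} -> {op2}({bank2})@{t2}: "
                    f"{t2 - t1} ns < {need} ns"
                )
    return violations


# Precharging bank a only 56 ns after its access page breaks the 60 ns minimum.
print(check_sequence([(0, "acp", "a"), (36, "bkx", "a"), (56, "pre", "a")]))
```

checking all pairs rather than only adjacent ones catches cases such as access page to precharge bank, where an intervening block transfer does not relax the requirement.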
video buffer timing parameters

table 7.7 video buffer timing parameters

symbol     parameter                                       limit   -10a, -10   -12   unit   refer   ch. 8 figure
t vclk     vid_clk cycle time                              min     12          12    ns     t 1     14
t vclkh    vid_clk high pulse width                        min     5           5     ns     t 2     14
t vclkl    vid_clk low pulse width                         min     5           5     ns     t 3     14
t vces     vid_cke setup time                              min     4           4     ns     t 8     14
t vceh     vid_cke hold time                               min     0           0     ns     t 9     14
t vq       vid_q access time from vid_clk                  max     8           8     ns     t 11    14
t vqvc     vid_q valid after vid_clk                       min     3           3     ns     t 12    14
t vlz      vid_q output low impedance                      min     3           3     ns     t 16    14
t vhz      vid_q output high impedance                     max     3           3     ns     t 15    14
t vqe      vid_q access time from vid_oe high              max     9           9     ns     t 17    14
t vqve     vid_q valid after vid_oe low                    min                       ns     t 14    14
t vxci1    initial vdx after last internal vid_clk                 14          14    ns             15
t vxci2    initial vdx before next internal vid_clk                80          80    ns             15
t vxqfi    vid_qsf delay time after initial vdx            max     80          80    ns     t 11    15
t qsf      vid_qsf delay time from internal vid_clk 38     max     25          25    ns     t 11    16
t vxc1     normal vdx after internal vid_clk 38                    20          20    ns             16
t vxc2     normal vdx before next internal vid_clk 38              60          60    ns             16
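the minimum vid_clk cycle time bounds the peak video output rate of each chip. the sketch below combines the 12 ns minimum t vclk with the statement in the chapter 6 figures that two bytes are shifted out on each vid_clk. it ignores blanking intervals, and the function names are illustrative assumptions.

```python
T_VCLK_MIN_NS = 12.0   # minimum vid_clk cycle time from table 7.7
BYTES_PER_VID_CLK = 2  # "shifted out two bytes at a time on each vid_clk"


def peak_video_bytes_per_second(t_vclk_ns=T_VCLK_MIN_NS):
    """Peak video port throughput of one 3d-ram, in bytes per second."""
    return BYTES_PER_VID_CLK / (t_vclk_ns * 1e-9)


def peak_pixels_per_second(bytes_per_pixel, num_chips=1, t_vclk_ns=T_VCLK_MIN_NS):
    """Peak pixel rate of num_chips 3d-rams feeding the ramdac in parallel."""
    return num_chips * peak_video_bytes_per_second(t_vclk_ns) / bytes_per_pixel


# Four interleaved chips with 32-bit (4-byte) pixels.
print(round(peak_pixels_per_second(4, num_chips=4) / 1e6, 1), "mpixels/s")
```

a real display also spends time in horizontal and vertical retrace, so the sustained rate needed for a given resolution and refresh rate should be compared against this peak with some margin.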
boundary-scan timing parameters

table 7.8 boundary-scan timing parameters

symbol    parameter                              limit   -10a, -10, -12   unit   refer   ch. 8 figure
t sclk    scan_tck cycle time                    min     100              ns     t 1     17
t sclkh   scan_tck high pulse width              min     40               ns     t 2     17
t sclkl   scan_tck low pulse width               min     40               ns     t 3     17
t scnts   scan_tms setup time                    min     8                ns     t 8     17
t scnth   scan_tms hold time                     min     26               ns     t 9     17
t scnis   scan_tdi setup time                    min     8                ns     t 8     17
t scnih   scan_tdi hold time                     min     26               ns     t 9     17
t slz     scan_tck to scan_tdo low impedance     min     20               ns     t 18    17
t sq      scan_tdo access time                   max     26               ns     t 19    17
t svd     scan_tdo data valid time               min     8                ns     t 20    17
t shz     scan_tck to scan_tdo high impedance    max     20               ns     t 21    17
t scnrs   scan_rst setup time                    min     8                ns     t 6     18
t scnrp   scan_rst pulse width                   min     30               ns     t 7     18
timing diagrams 8

timing diagrams

this chapter shows 18 timing diagrams, which are summarized in table 8.1. these diagrams show the gross timing specifications. refer to chapter 7 for exact timing measurements. also, the entries of parameter values from tables 7.4 through 7.9 are repeated here for convenient reference.

table 8.1 timing diagram figures

figure   description
8.1      power on reset
8.2      restart reset
8.3      dram array initialization
8.4      pixel port read
8.5      pixel port write
8.6      pick logic timing
8.7      dram operations on the same bank (1)
8.8      dram operations on the same bank (2)
8.9      dram operations on the same bank (3)
8.10     dram operations on the same bank (4)
8.11     dram operations between two different banks (1)
8.12     dram operations between two different banks (2)
8.13     dram operations between two different banks (3)
8.14     internal vid_clk and video output timing
8.15     video output sequence from initial vdx for normal and reversed modes
8.16     continuous video output sequence in normal mode during display
8.17     boundary scan
8.18     boundary scan reset
figure 8.1 power on reset. signals shown: v dd, mclk, reset, internal function. internal function steps: stabilize internal power supply, reset registers, initialize dram array, start normal operation. labels in the figure: > 500?, > 9 cycles, t rss.

figure 8.2 restart reset. signals shown: mclk, reset, internal function. internal function steps: reset registers, initialize dram array, start normal operation. labels in the figure: > 9 cycles, t rss, t rsp.

table 8.2 reset timing parameters

symbol   parameter           limit   -10a, -10   -12   unit   refer
t rss    reset setup time    min     0           0     ns     t 6
t rsp    reset pulse width   min     40          48    ns     t 7
figure 8.3 dram array initialization. signals shown: mclk, dram_op, dram_bs, internal function (initialize dram array, then start normal operation). the figure gives two alternative command sequences:
• dram_op: acp, acp, acp, acp, pre, pre, pre, pre with dram_bs: a, b, c, d, a, c, b, d
• dram_op: acp, pre, acp, acp, pre, acp, pre, pre with dram_bs: a, a, b, c, b, d, c, d
note: refer to figure 2.6 for an example of combined operations of pixel alu read and write.

figure 8.4 pixel port read. signals shown: mclk, palu_en, palu_op, palu_we, palu_a, palu_be n, palu_dq [8n+7..8n]. timings marked: t clk, t clkh, t clkl, t ens, t enh, t ops, t oph, t wes, t weh, t ads, t adh, t bes, t beh, t clz, t cq, t cvd, t chz.

figure 8.5 pixel port write. signals shown over mclk cycles 1 to 10: palu_en, palu_op, palu_we, palu_a, palu_be n, palu_dq [8n+7..8n], palu_dx n, pass_in, pass_out. timings marked: t dqs, t dqh, t pss, t psh, t cps, t cpsv.
table 8.3 pixel alu timing parameters

symbol   parameter                          limit   -10a    -10     -12     unit   refer
t clk    master clock mclk cycle time       min     10      10 *    12      ns     t 1
                                            max     16000   16000   16000   ns
t clkh   mclk high pulse width              min     4       4       5       ns     t 2
t clkl   mclk low pulse width               min     4       4       5       ns     t 3
t ens    palu_en setup time                 min     3       3       4       ns     t 8
t enh    palu_en hold time                  min     1.5     1.5     1.5     ns     t 9
t ops    palu_op setup time                 min     3       3       4       ns     t 8
t oph    palu_op hold time                  min     1.5     1.5     1.5     ns     t 9
t ads    palu_a setup time                  min     3       3       4       ns     t 8
t adh    palu_a hold time                   min     1.5     1.5     1.5     ns     t 9
t wes    palu_we setup time                 min     3       3       4       ns     t 8
t weh    palu_we hold time                  min     1.5     1.5     1.5     ns     t 9
t bes    palu_be setup time                 min     3       3       4       ns     t 8
t beh    palu_be hold time                  min     1.5     1.5     1.5     ns     t 9
t clz    mclk to palu_dq low impedance      min     4       4       5       ns     t 10
t cq     palu_dq access time                max     14      14      18      ns     t 11
t cvd    palu_dq data valid time            min     4       4       4       ns     t 12
t chz    mclk to palu_dq high impedance     max     ? 4?                    ns     t 13
t dqs    palu_dq, palu_dx setup time        min     3       3       4       ns     t 8
t dqh    palu_dq, palu_dx hold time         min     1.5     1.5     1.5     ns     t 9
t pss    pass_in setup time                 min     2       2       3       ns     t 8
t psh    pass_in hold time                  min     1.5     1.5     1.5     ns     t 9
t cps    mclk to valid pass_out             max     6       6       8       ns     t 11
t cpsv   pass_out data valid time           min     3       3       3       ns     t 12

* t clk = 10.0 ns except that for the alpha saturate logic t clk = 12.0 ns.
figure 8.6 picking logic timing. signals shown over mclk cycles 1 to 10: palu_en, palu_op, palu_we, palu_a, palu_be [2..0], palu_be 3, palu_dq [31..0], pass_in, pass_out, hit. timing marked: t cht.

note:
1. the hit signal is cleared by writing to the compare control register.
2. the hit signal can be set by the comparison result from the pass_in and pass_out pins, which are generated two cycles before the hit signal is.

table 8.4 picking logic timing parameter

symbol   parameter           limit   -10a, -10   -12   unit   refer
t cht    mclk to valid hit   max     35          35    ns     t 11
figure 8.7 dram operations on the same bank (1). signals shown: mclk, dram_en, dram_op, dram_bs, dram_a. operation sequence on bank a: pre, nop, acp, nop, bkx, nop, bkx, nop, pre. timings marked: t dens, t denh, t dops, t doph, t dbks, t dbkh, t dads, t dadh, t dpas, t dabs, t dbbs, t dbps, t daps.

figure 8.8 dram operations on the same bank (2). signals shown: mclk, dram_en, dram_op, dram_bs, dram_a. operation sequence on bank a: acp, nop, dup, nop, dup, nop, bkx, nop, dup. timings marked: t dads, t ddds, t ddbs, t dbds.

note: bkx means any block transfer operation, such as uwb, mwb, or rdb.
table 8.5 minimum requirements of the dram port interface timing parameters

symbol   parameter            -10a, -10   -12   unit   refer   timing figure
t dens   dram_en setup time   3           4     ns     t 8     8.7
t denh   dram_en hold time    1.5         1.5   ns     t 9     8.7
t dops   dram_op setup time   3           4     ns     t 8     8.7
t doph   dram_op hold time    1.5         1.5   ns     t 9     8.7
t dbks   dram_bs setup time   3           4     ns     t 8     8.7
t dbkh   dram_bs hold time    1.5         1.5   ns     t 9     8.7
t dads   dram_a setup time    3           4     ns     t 8     8.7
t dadh   dram_a hold time     1.5         1.5   ns     t 9     8.7

table 8.6 minimum requirements of the dram interlock timings for operations on same bank

symbol       parameter                          -10a, -10   -12   unit   timing figure
t dabs       access page to block transfer      36          36    ns     8.7
t daps (a)   access page to precharge bank      60          72    ns     8.7
t dads       access page to duplicate page      48          48    ns     8.8
t dbbs       block transfer to block transfer   20          24    ns     8.7
t dbps       block transfer to precharge bank   20          24    ns     8.7
t dbds       block transfer to duplicate page   20          24    ns     8.8
t dpas (b)   precharge bank to access page      40          48    ns     8.7
t ddbs       duplicate page to block transfer   80          96    ns     8.8
t ddds       duplicate page to duplicate page   80          96    ns     8.8

a. the maximum timing limit from acp to pre is 100,000 ns.
b. the operation from pre to acp requires at least two clock cycles. at the first clock rising edge, pre starts. at the second clock rising edge, the preparation for acp starts.
figure 8.9 dram operations on the same bank (3). signals shown: mclk, dram_en, dram_op, dram_bs, dram_a. operation sequence on bank a: acp, nop, vdx, nop, dup, vdx, nop, pre. timings marked: t davs, t dvds, t ddvs, t dvps, t dvvs, t ddps.

figure 8.10 dram operations on the same bank (4). signals shown: mclk, dram_en, dram_op, dram_bs, dram_a. operation sequence on bank a: vdx, nop, bkx, nop, vdx, nop, pre, nop, pre. timings marked: t dvbs, t dbvs, t dpps.

note: bkx means any block transfer operation, such as uwb, mwb, or rdb.
table 8.7 minimum requirements of the dram interlock timings for operations on same bank

symbol   parameter                          -10a, -10   -12   unit   timing figure
t davs   access page to video transfer      40          48    ns     8.9
t dbvs   block transfer to video transfer   20          24    ns     8.10
t dpps   precharge bank to precharge bank   10          12    ns     8.10
t ddps   duplicate page to precharge bank   80          96    ns     8.9
t ddvs   duplicate page to video transfer   80          96    ns     8.9
t dvbs   video transfer to block transfer   40          48    ns     8.10
t dvps   video transfer to precharge bank   20          24    ns     8.9
t dvds   video transfer to duplicate page   40          48    ns     8.9
t dvvs   video transfer to video transfer   80          96    ns     8.9
figure 8.11 dram operations between two different banks (1). signals shown: mclk, dram_en, dram_op, dram_bs, dram_a. operations shown across different banks: pre, acp, bkx, dup, nop. timings marked: t dbad, t dbpd, t dppd, t dpad, t dabd, t dpdd, t dbbd, t dbdd, t dpbd, t daad, t dapd.

figure 8.12 dram operations between two different banks (2). signals shown: mclk, dram_en, dram_op, dram_bs, dram_a. operations shown across different banks: acp, dup, pre, bkx. timings marked: t dadd, t ddbd, t ddpd, t dddd, t ddad.

note: bkx means any block transfer operation, such as uwb, mwb, or rdb.
table 8.8 minimum requirement of the dram interlock timings for operations between two different banks

symbol   parameter                          -10a, -10   -12   unit   timing figure
t daad   access page to access page         40          48    ns     8.11
t dabd   access page to block transfer      10          12    ns     8.11
t dapd   access page to precharge bank      40          48    ns     8.11
t dadd   access page to duplicate page      40          48    ns     8.12
t dbad   block transfer to access page      10          12    ns     8.11
t dbbd   block transfer to block transfer   20          24    ns     8.11
t dbpd   block transfer to precharge bank   10          12    ns     8.11
t dbdd   block transfer to duplicate page   10          12    ns     8.11
t dpad   precharge bank to access page      10          12    ns     8.11
t dpbd   precharge bank to block transfer   10          12    ns     8.11
t dppd   precharge bank to precharge bank   10          12    ns     8.11
t dpdd   precharge bank to duplicate page   10          12    ns     8.11
t ddad   duplicate page to access page      80          96    ns     8.12
t ddbd   duplicate page to block transfer   10          12    ns     8.12
t ddpd   duplicate page to precharge bank   40          48    ns     8.12
t dddd   duplicate page to duplicate page   80          96    ns     8.12
figure 8.13 dram operations between two different banks (3). signals shown: mclk, dram_en, dram_op, dram_bs, dram_a. operations shown across different banks: dup, bkx, vdx, nop, acp, pre. timings marked: t davd, t dbvd, t dvbd, t dvdd, t dvvd, t dpvd, t dvad, t dvpd, t ddvd.

table 8.9 minimum requirement of the dram interlock timings for operations between two different banks

symbol   parameter                          -10a, -10   -12   unit
t davd   access page to video transfer      40          48    ns
t dbvd   block transfer to video transfer   10          12    ns
t dpvd   precharge bank to video transfer   10          12    ns
t ddvd   duplicate page to video transfer   80          96    ns
t dvad   video transfer to access page      40          48    ns
t dvbd   video transfer to block transfer   10          12    ns
t dvpd   video transfer to precharge bank   20          24    ns
t dvdd   video transfer to duplicate page   40          48    ns
t dvvd   video transfer to video transfer   80          96    ns
figure 8.14 internal vid_clk and video output timing. signals shown: vid_clk, vid_cke, internal vid_clk, vid_oe, vid_q (data 1 through data 6). timings marked: t vclk, t vclkh, t vclkl, t vces, t vceh, t vq, t vqvc, t vqe, t vqve, t vlz, t vhz.

note:
1. the deassertion of vid_cke at the current vid_clk rising edge will mask out the next internal vid_clk cycle.
2. timings are measured from the vid_clk pin or from the vid_oe pin.

figure 8.15 video output sequence from initial vdx for normal and reversed modes. signals shown: mclk, dram_op (vdx), dram_a [8..7], vid_clk, vid_cke, internal vid_clk, vid_qsf, vid_q in normal mode (data 0, data 1, data 2, data 3), vid_q in reversed mode (data 1, data 0, data 3, data 2). timings marked: t vq, t vxqfi, t vxci1, t vxci2.

note:
1. vid_oe is assumed to be held at its enabled level for video output.
2. timings are measured from the mclk pin or from the vid_clk pin.
table 8.10 video buffer timing parameters
(values are listed per speed grade as they survive in the source; "?" marks an entry that is illegible, and the min/max column of each value is not recoverable)

symbol    parameter                                   -10a, -10   -12   unit   refer   timing figure
t_vclk    vid_clk cycle time                          12          12    ns     t1      8.14
t_vclkh   vid_clk high pulse width                    5           ?     ns     t2      8.14
t_vclkl   vid_clk low pulse width                     5           ?     ns     t3      8.14
t_vces    vid_cke setup time                          4           ?     ns     t8      8.14
t_vceh    vid_cke hold time                           0           ?     ns     t9      8.14
t_vq      vid_q access time from vid_clk              ?           ?     ns     t11     8.14
t_vqvc    vid_q valid after vid_clk                   3           ?     ns     t12     8.14
t_vlz     vid_q output low impedance                  3           ?     ns     t16     8.14
t_vhz     vid_q output high impedance                 ?           ?     ns     t15     8.14
t_vqe     vid_q access time from vid_oe high          ?           ?     ns     t17     8.14
t_vqve    vid_q valid after vid_oe low                                  ns     t14     8.14
t_vxci1   initial vdx after last internal vid_clk     14          14    ns             8.15
t_vxci2   initial vdx before next internal vid_clk    80          80    ns             8.15
t_vxqfi   vid_qsf delay time after initial vdx        80          80    ns     t11     8.15
figure 8.16 continuous video output sequence in normal mode during display
[timing diagram: mclk, dram_op (vdx), dram_bs, dram_a[8..7], vid_clk, vid_cke, vid_qsf and vid_q[15..0] waveforms for banks 0, 1 and 2, with the video counter running 0 to 39 and showing an initial vdx during retrace, a normal vdx during retrace and a normal vdx during display; annotated with t_qsf, t_vq, t_vxc1 and t_vxc2]

note:
1. t_vxc1 specifies the earliest allowed normal vdx.
2. t_vxc2 specifies the latest allowed normal vdx.

table 8.11 video buffer timing parameters
(the min/max column of each value is not recoverable from the source)

symbol   parameter                                      -10a, -10   -12   unit   refer
t_qsf    vid_qsf delay time from internal vid_clk 38    25          25    ns     t11
t_vxc1   normal vdx after internal vid_clk 38           20          20    ns
t_vxc2   normal vdx before next internal vid_clk 38     60          60    ns
figure 8.17 boundary scan
[timing diagram: scan_tck, scan_tms, scan_tdi and scan_tdo waveforms with the controller state moving through test-logic-reset, run-test/idle, shift-ir, exit-ir and update-ir; annotated with t_sclk, t_sclkh, t_sclkl, t_scnts, t_scnth, t_scnis, t_scnih, t_slz, t_sq, t_svd and t_shz]

table 8.12 boundary-scan timing parameters
(one value applies to the -10a, -10 and -12 grades; the min/max column of each value is not recoverable from the source)

symbol    parameter                               value   unit   refer
t_sclk    scan_tck cycle time                     100     ns     t1
t_sclkh   scan_tck high pulse width               40      ns     t2
t_sclkl   scan_tck low pulse width                40      ns     t3
t_scnts   scan_tms setup time                     8       ns     t8
t_scnth   scan_tms hold time                      26      ns     t9
t_scnis   scan_tdi setup time                     8       ns     t8
t_scnih   scan_tdi hold time                      26      ns     t9
t_slz     scan_tck to scan_tdo low impedance      20      ns     t18
t_sq      scan_tdo access time                    26      ns     t19
t_svd     scan_tdo data valid time                8       ns     t20
t_shz     scan_tck to scan_tdo high impedance     20      ns     t21
figure 8.18 boundary scan reset
[timing diagram: scan_rst and scan_tck waveforms, annotated with t_scnrs and t_scnrp]

table 8.13 boundary-scan reset timing parameters
(one value applies to the -10a, -10 and -12 grades; the min/max column of each value is not recoverable from the source)

symbol    parameter                value   unit   refer
t_scnrs   scan_rst setup time      8       ns     t6
t_scnrp   scan_rst pulse width     30      ns     t7
packaging 9

packaging

the 3d-ram is housed in 128-pin qfp (fp) and reverse qfp (rf) packages. for convenient reference, the pinout diagrams for the 3d-ram are repeated on pages 137 and 138. page 139 contains the mechanical specification for the fp and rf packages. the thermal characteristics data for both packages is on page 142.

3d-ram pinouts

there are two pinouts for the 3d-ram:
- normal pinout, with pin 1 located at the lower left-hand corner and marked by a small circle
- reverse pinout, with pin 1 located at the upper left-hand corner and marked by a large circle and a pointing triangle

the device in normal pinout is designated by the letters "fp" in the product number, and the device in reverse pinout by the letters "rf". in both pinouts, the mapping of pin numbers to pin names is identical.

normal pinout diagram

[pinout diagram m1048, M5M410092Bfp, with tracking label dddmmmmm-nn; pin names are listed below in the order given]

pins 1 to 38: scan_tms, scan_tck, scan_rst, vid_q8, vid_q9, vss, vid_q10, vid_q11, vid_q12, vid_q13, vdd, vid_q14, vid_q15, vid_qsf, vid_cke, vss, vss, pass_in0, vdd, vid_clk, vss, pass_in1, vss, vid_oe, hit, vid_q0, vid_q1, vdd, vid_q2, vid_q3, vid_q4, vid_q5, vss, vid_q6, vid_q7, scan_tdo, scan_tdi, vdd

pins 39 to 64: vdd, dram_a0, dram_a1, dram_a2, dram_a3, dram_a4, dram_a5, dram_en, dram_op0, vss, palu_a0, palu_a1, palu_a2, palu_en0, palu_op0, palu_op1, vss, palu_be0, palu_be1, palu_dx0, palu_dx1, palu_dq0, palu_dq1, palu_dq2, palu_dq3, vdd

pins 102 down to 65: vdd, palu_dq27, palu_dq26, palu_dq25, palu_dq24, vss, palu_dq23, palu_dq22, palu_dq21, palu_dq20, vdd, palu_dq19, palu_dq18, palu_dq17, palu_dq16, vss, vss, pass_out, vdd, mclk, nc, vss, vss, palu_dq15, palu_dq14, palu_dq13, palu_dq12, vdd, palu_dq11, palu_dq10, palu_dq9, palu_dq8, vss, palu_dq7, palu_dq6, palu_dq5, palu_dq4, vdd

pins 128 down to 103: vdd, reset, dram_bs1, dram_bs0, dram_a8, dram_a7, dram_a6, dram_op2, dram_op1, vss, palu_a5, palu_a4, palu_a3, palu_en1, palu_we, palu_op2, vss, palu_be3, palu_be2, palu_dx3, palu_dx2, palu_dq31, palu_dq30, palu_dq29, palu_dq28, vdd
reverse pinout diagram

[pinout diagram m1003, M5M410092Brf, with tracking label dddmmmmm-nn; pin names are listed below in the order given]

pins 1 to 38: scan_tms, scan_tck, scan_rst, vid_q8, vid_q9, vss, vid_q10, vid_q11, vid_q12, vid_q13, vdd, vid_q14, vid_q15, vid_qsf, vid_cke, vss, vss, pass_in0, vdd, vid_clk, vss, pass_in1, vss, vid_oe, hit, vid_q0, vid_q1, vdd, vid_q2, vid_q3, vid_q4, vid_q5, vss, vid_q6, vid_q7, scan_tdo, scan_tdi, vdd

pins 39 to 64: vdd, dram_a0, dram_a1, dram_a2, dram_a3, dram_a4, dram_a5, dram_en, dram_op0, vss, palu_a0, palu_a1, palu_a2, palu_en0, palu_op0, palu_op1, vss, palu_be0, palu_be1, palu_dx0, palu_dx1, palu_dq0, palu_dq1, palu_dq2, palu_dq3, vdd

pins 102 down to 65: vdd, palu_dq27, palu_dq26, palu_dq25, palu_dq24, vss, palu_dq23, palu_dq22, palu_dq21, palu_dq20, vdd, palu_dq19, palu_dq18, palu_dq17, palu_dq16, vss, vss, pass_out, vdd, mclk, nc, vss, vss, palu_dq15, palu_dq14, palu_dq13, palu_dq12, vdd, palu_dq11, palu_dq10, palu_dq9, palu_dq8, vss, palu_dq7, palu_dq6, palu_dq5, palu_dq4, vdd

pins 128 down to 103: vdd, reset, dram_bs1, dram_bs0, dram_a8, dram_a7, dram_a6, dram_op2, dram_op1, vss, palu_a5, palu_a4, palu_a3, palu_en1, palu_we, palu_op2, vss, palu_be3, palu_be2, palu_dx3, palu_dx2, palu_dq31, palu_dq30, palu_dq29, palu_dq28, vdd

tracking label

on the top surface of the 3d-ram package, a tracking label is printed below the mitsubishi logo and the 3d-ram product number. the tracking label consists of 7 numbers followed by a dash and a speed/power grade designation, and is represented by the mnemonic "dddmmmmm-nn". the mnemonic is explained below:

ddd:    date code
mmmmm:  manufacturing code
nn:     one of the following speed designations:
        -10a   t_clk (min) = 10 ns
        -10    t_clk (min) = 10 ns, except t_clk (min) = 12 ns for the alpha saturate logic
        -12    t_clk (min) = 12 ns
mechanical drawing for 128-pin fp and rf packages

figure 9.1 128-pin fp package drawing (M5M410092Bfp)
[package drawing m1041: outline with pin 1, 38, 39, 64, 65, 102, 103 and 128 positions, seating plane, detail f and recommended mount pad; dimension symbols are defined in table 9.1]

figure 9.2 128-pin rf package drawing (M5M410092Brf)
[package drawing m1042: same views as figure 9.1 with the pin numbering reversed; dimension symbols are defined in table 9.1]
table 9.1 package drawing parameters in millimeters
(three values are min / nominal / max; where the source gives a single value, its column is not recoverable)

description                                   symbol   dimension in millimeters
mounting height                               A        1.7
stand-off height                              A1       0.05 / 0.15 / 0.25
package height                                A2       1.4
terminal width                                b        0.13 / 0.18 / 0.28
terminal thickness                            c        0.105 / 0.125 / 0.175
package length                                D        13.9 / 14.0 / 14.1
package width                                 E        19.9 / 20.0 / 20.1
linear spacing between terminals              e        0.5
over length                                   HD       15.8 / 16.0 / 16.2
over width                                    HE       21.8 / 22.0 / 22.2
length of the flat portion of terminal        L        0.3 / 0.5 / 0.7
terminal length                               L1       1.0
flatness of terminal                          y        0.1
terminal angle (degrees)                      θ        0 / ?0
mount pad dimensions                          b2       0.225
mount pad dimensions                          i2       1.0
mount pad dimensions                          MD       14.4
mount pad dimensions                          ME       20.4
thermal characteristics

the maximum junction temperature (t_jc) is set to 125 °C. the junction temperature can be calculated with the following equation:

t_jc (°C) = θ_ja (°C/W) x p (W) + t_a (°C)

where θ_ja is the junction-to-ambient thermal resistance, p is the whole-chip power dissipation, and t_a is the ambient temperature.

thermal resistance for single package

four cases of the thermal resistance for a single package are listed in table 9.2: package only, package on pcb, package on pcb with thermal compound, and package on pcb with fin. figures 9.3 through 9.7 show the detailed conditions of these four cases.

figure 9.3 case 1 condition: package only
[drawing: package (adiabatic plane)]
figure 9.4 case 2 condition: package on pcb
[drawing: package (adiabatic plane), pcb]
figure 9.5 case 3 condition: package on pcb with thermal compound
[drawing: package (adiabatic plane), pcb, thermal compound]
figure 9.6 case 4 condition: package on pcb with fin
[drawing: package (adiabatic plane), pcb, 9.2 mm fin]

table 9.2 thermal resistance for single package, θ_ja (°C/W)

airflow                     case 1         case 2           case 3                    case 4
                            package only   package on pcb   package on pcb            package on pcb
                                                            with compound             with fin
0, or natural convection    160.0          86.0             80.0                      44.4
100 ft/min or 0.5 m/s       97.0           66.2             60.3                      31.2
200 ft/min or 1.0 m/s       78.0           58.7             53.9                      26.3
400 ft/min or 2.0 m/s       59.5           49.4             43.6                      20.9
1000 ft/min or 5.0 m/s      43.1           39.5             35.1                      16.2
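the junction temperature equation can be exercised with a short sketch. the function names and the power and ambient figures used in the example are hypothetical; only the equation, the 125 °C limit and the θ_ja values come from this chapter.

```python
TJ_MAX_C = 125.0  # maximum junction temperature stated in this chapter


def junction_temp_c(theta_ja_c_per_w, power_w, ambient_c):
    """t_jc = theta_ja x p + t_a."""
    return theta_ja_c_per_w * power_w + ambient_c


def max_power_w(theta_ja_c_per_w, ambient_c, tj_max_c=TJ_MAX_C):
    """Largest power dissipation that keeps t_jc at or below the limit."""
    return (tj_max_c - ambient_c) / theta_ja_c_per_w


# Case 2 (package on pcb), natural convection: theta_ja = 86.0 C/W from table 9.2.
# The 1.0 W dissipation and 25 C ambient are illustrative values only.
example_tj = junction_temp_c(86.0, 1.0, 25.0)
```

solving the same equation for p, as `max_power_w` does, gives the power budget for a chosen cooling case and ambient temperature.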
figure 9.7 mechanical drawing of the fin
[drawing: fin with dimension labels φ13, φ6, 1.5, 1.5, 1.0, 9.2, 2.0 and 7.5]
thermal resistance for twelve packages mounted on pcb

table 9.3 lists the thermal resistance for 12 packages mounted on two sides of a pcb. two cases are listed: without heat sink, and with an aluminum plate as heat sink.

figure 9.8 multiple packages double-side mounted on pcb, without heat sink
[drawing: solder, pcb, fe]
figure 9.9 multiple packages double-side mounted on pcb, with heat sink
[drawing: solder, pcb, fe, resin, al plate (40 x 20 x 0.6 mm)]

table 9.3 thermal resistance for 12 packages mounted double-sided on pcb, θ_ja (°C/W)

airflow                        without heat sink   with heat sink
0 m/s or natural convection    84.0                41.0
100 ft/min or 0.5 m/s          73.0                26.1
200 ft/min or 1.0 m/s          59.2                21.5
400 ft/min or 2.0 m/s          43.8                17.0
1000 ft/min or 5.0 m/s         30.7                13.3
figure 9.10 mechanical drawing of the mounting, without heat sink
[drawing: glass epoxy pcb in the air flow, with dimension labels 76.2, 10.16 and 1.6]

figure 9.11 mechanical drawing of the mounting, with heat sink
[drawing: pcb with resin and al plate in the air flow, with dimension labels 76.2, 10.16, 20, 40, 1.6, 0.6 and max 2.5]
jtag boundary scan 10

jtag boundary scan

boundary-scan architecture

the 3d-ram provides test features that are partially in compliance with ieee standard 1149.1, test access port and boundary-scan architecture. the on-chip test logic provides a standardized approach for checking the interconnections between different components on the same printed circuit board.

the boundary-scan test logic consists of the boundary-scan register and support logic. the test function is accessed through the test access port (tap). the tap provides a simple serial interface that allows testing of all signal traces with only five pins in the serial test port.

inside the 3d-ram, the boundary-scan cells for the signal pins are interconnected to form a shift-register chain around the pads. this path has serial input and output connections with scan clock and control signals. on a printed circuit board, the boundary-scan registers for the individual components can be connected in series to form a single path through the whole board, as illustrated in figure 10.1. alternatively, a board design could contain several independent boundary-scan paths.

figure 10.1 a boundary-scannable board design
[diagram: components with boundary-scan cells joined by the system interconnect and by a serial test interconnect running from serial data in to serial data out]
the boundary-scan test logic contains the following elements:

- test access port (tap), consisting of input pins scan_tms, scan_tck, scan_rst and scan_tdi, and an output pin scan_tdo
- tap controller, which interprets the inputs on the test mode select line (scan_tms) and performs the corresponding operations, such as controlling the scan instruction and data registers within the 3d-ram
- instruction register (ir), which accepts instruction codes shifted through the test data input (scan_tdi) pin
- two test data registers: bypass register (bpr) and boundary-scan register (bsr)

the instruction and test data registers are separate shift-register paths connected in parallel, and have a common serial data input (scan_tdi) and a common serial data output (scan_tdo). the data flow is controlled by the tap controller signals. a block diagram of the boundary-scan architecture is shown in figure 10.2.

figure 10.2 block diagram of the boundary scan architecture
[block diagram m1002: scan_tdi feeds the boundary scan register, bypass register and instruction register, whose outputs are selected onto scan_tdo; the tap controller, driven by scan_tms, scan_tck and scan_rst, and the instruction decode supply the clocks and/or controls]
the tap controller

the tap controller is a synchronous finite state machine controlling the sequence of test logic operations. the tap controller changes state at the rising edge of the scan_tck pin. the scan_tms pin controls the sequence of the state changes. a state diagram for the tap controller is shown in figure 10.3.

the tap controller is initialized either after power up or when scan_rst is asserted low. in addition, it can be initialized by applying a high signal level on the scan_tms input for five scan_tck cycles.

figure 10.3 tap controller state diagram
[state diagram: test-logic-reset, run-test/idle, select-dr-scan, capture-dr, shift-dr, exit1-dr, pause-dr, exit2-dr, update-dr, select-ir-scan, capture-ir, shift-ir, exit1-ir, pause-ir, exit2-ir and update-ir, with each transition labeled by the scan_tms value 0 or 1]
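the state machine can be sketched as a lookup table. the transitions below are the standard ieee 1149.1 tap controller transitions, which is an assumption here: the state names match figure 10.3, but the individual arrows cannot be read back from this text. the names `TAP_NEXT` and `run` are hypothetical.

```python
# state -> (next state when scan_tms = 0, next state when scan_tms = 1),
# sampled on each rising edge of scan_tck.
TAP_NEXT = {
    "test-logic-reset": ("run-test/idle", "test-logic-reset"),
    "run-test/idle":    ("run-test/idle", "select-dr-scan"),
    "select-dr-scan":   ("capture-dr", "select-ir-scan"),
    "capture-dr":       ("shift-dr", "exit1-dr"),
    "shift-dr":         ("shift-dr", "exit1-dr"),
    "exit1-dr":         ("pause-dr", "update-dr"),
    "pause-dr":         ("pause-dr", "exit2-dr"),
    "exit2-dr":         ("shift-dr", "update-dr"),
    "update-dr":        ("run-test/idle", "select-dr-scan"),
    "select-ir-scan":   ("capture-ir", "test-logic-reset"),
    "capture-ir":       ("shift-ir", "exit1-ir"),
    "shift-ir":         ("shift-ir", "exit1-ir"),
    "exit1-ir":         ("pause-ir", "update-ir"),
    "pause-ir":         ("pause-ir", "exit2-ir"),
    "exit2-ir":         ("shift-ir", "update-ir"),
    "update-ir":        ("run-test/idle", "select-dr-scan"),
}


def run(state, tms_bits):
    """Apply one scan_tms value per scan_tck rising edge; return the final state."""
    for tms in tms_bits:
        state = TAP_NEXT[state][tms]
    return state


# Holding scan_tms high for five clocks reaches test-logic-reset from any state.
reset_reachable = all(run(s, [1] * 5) == "test-logic-reset" for s in TAP_NEXT)
```

with this table, the scan_tms sequence 0, 1, 1, 0, 0 applied from test-logic-reset walks through run-test/idle, select-dr-scan, select-ir-scan and capture-ir into shift-ir.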
test-logic-reset state
the instruction register is set to the default bypass instruction, so that normal 3d-ram operations can proceed without interference. the tap controller enters this state when it is initialized after power-up or by the reset signal scan_rst. regardless of the original state, the controller enters this state when the scan_tms input is held high for at least five rising edges of scan_tck.

run-test/idle state
this is an idle state. the test data register selected by the current instruction retains its previous value during this state. the current instruction does not change in this state.

select-dr-scan state
this is a temporary controller state. the test data register selected by the current instruction retains its previous value during this state. the current instruction does not change in this state.

capture-dr state
the boundary-scan register captures the data present at the 3d-ram input pins if the current instruction is extest or sample/preload. the bypass register does not change. the current instruction does not change in this state.

shift-dr state
the test data register selected by the current instruction shifts data one stage toward scan_tdo on each rising edge of scan_tck. the current instruction does not change in this state.

exit1-dr state
this is a temporary controller state. the test data register selected by the current instruction retains its previous value during this state. the current instruction does not change in this state.

pause-dr state
the pause-dr state allows the data shifting through the test data register to be temporarily halted. the test data register selected by the current instruction retains its previous value during this state. the current instruction does not change in this state.

exit2-dr state
this is a temporary controller state. the test data register selected by the current instruction retains its previous value during this state. the current instruction does not change in this state.

update-dr state
the boundary-scan cell for an output pad has a latch that prevents changes at the parallel output while data is shifting along the boundary-scan chain. when the tap controller is in this state and the boundary-scan register is selected, data is latched from the shift-register path on the falling edge of scan_tck. the data held in the latch changes only in this state. all shift-register stages in the selected test data register retain their previous values during this state. the current instruction does not change in this state.

select-ir-scan state
this is a temporary controller state. the test data register selected by the current instruction retains its previous value during this state. the current instruction does not change in this state.

capture-ir state
the shift-register contained in the instruction register is loaded with the fixed value "0001" on the rising edge of scan_tck. the test data register selected by the current instruction retains its previous value during this state. the current instruction does not change in this state.
shift-ir state
the shift-register contained in the instruction register is connected between the scan_tdi and scan_tdo pins. the shift-register shifts data one stage towards its serial output on each rising edge of scan_tck. the test data register selected by the current instruction retains its previous value during this state. the current instruction does not change in this state.

exit1-ir state
this is a temporary state. the test data register selected by the current instruction retains its previous value during this state. the current instruction does not change in this state.

pause-ir state
the pause-ir state allows the data shifting through the instruction register to be temporarily halted. the test data register selected by the current instruction retains its previous value during this state. the current instruction does not change in this state.

exit2-ir state
this is a temporary state. the test data register selected by the current instruction retains its previous value during this state. the current instruction does not change in this state.

update-ir state
the instruction shifted into the instruction register is latched from the shift-register path on the falling edge of scan_tck. the test data register selected by the current instruction retains its previous value during this state. once the new instruction has been latched, it becomes the current instruction.

test data register

the 3d-ram contains the two required test data registers: the bypass register and the boundary-scan register. both registers are connected to the scan_tdi and scan_tdo pins. when a register is selected by the current instruction, the data in its shift-register is shifted one stage towards the scan_tdo output pin on each rising edge of the scan_tck pin.

bypass register
the bypass register is a one-bit shift-register that provides the minimal length path between the scan_tdi and scan_tdo pins. when the 3d-ram is not required to perform a scan test operation, this path can be selected to allow rapid movement of test data to and from other components on the board.

boundary-scan register
the boundary-scan register is a shift-register path containing the boundary-scan cells that are connected to all input/output signal pins of the 3d-ram except the following pins: mclk, pass_in[1:0], pass_out, and vid_clk. the boundary-scan cells of the palu_dq[31:0] pins are implemented as input pins only.

figure 10.4 shows the logical structure of the boundary-scan register. while output cells determine the value of the signal driven on the corresponding pin, input cells only capture data. the output cell has a latch connected to the shift-register for latching the data in the update-dr state. these operations do not affect the normal operations of the device. data is shifted from the scan_tdi pin to the scan_tdo pin through the boundary-scan register during scanning. the boundary-scan register can be operated by the extest and sample/preload instructions.
figure 10.4 logical structure of the boundary-scan register
[diagram m1001: boundary-scan (b/s) cells placed between the pins and the 3d-ram system logic. input cells: reset, palu_dq, palu_dx, palu_a, palu_op, palu_be, palu_en, dram_a, dram_bs, dram_op, dram_en and vid_cke. a separate cell for vid_oe. output cells: vid_q, vid_qsf and hit]
the boundary-scan cells inside the boundary-scan register are organized in the following order:

scan_tdi → dram_a0 → dram_a1 → dram_a2 → dram_a3 → dram_a4 → dram_a5 → dram_en → dram_op0 → palu_a0 → palu_a1 → palu_a2 → palu_en0 → palu_op0 → palu_op1 → palu_be0 → palu_be1 → palu_dx0 → palu_dx1 → palu_dq0 → palu_dq1 → palu_dq2 → palu_dq3 → palu_dq4 → palu_dq5 → palu_dq6 → palu_dq7 → palu_dq8 → palu_dq9 → palu_dq10 → palu_dq11 → palu_dq12 → palu_dq13 → palu_dq14 → palu_dq15 → palu_dq16 → palu_dq17 → palu_dq18 → palu_dq19 → palu_dq20 → palu_dq21 → palu_dq22 → palu_dq23 → palu_dq24 → palu_dq25 → palu_dq26 → palu_dq27 → palu_dq28 → palu_dq29 → palu_dq30 → palu_dq31 → palu_dx2 → palu_dx3 → palu_be2 → palu_be3 → palu_op2 → palu_we → palu_en1 → palu_a3 → palu_a4 → palu_a5 → dram_op1 → dram_op2 → dram_a6 → dram_a7 → dram_a8 → dram_bs0 → dram_bs1 → reset → vid_q8 → vid_q9 → vid_q10 → vid_q11 → vid_q12 → vid_q13 → vid_q14 → vid_q15 → vid_qsf → vid_cke → vid_oe → hit → vid_q0 → vid_q1 → vid_q2 → vid_q3 → vid_q4 → vid_q5 → vid_q6 → vid_q7 → scan_tdo

instruction register

the instruction register (ir) allows instructions to be serially shifted into the 3d-ram through the scan_tdi pin. the instruction selects the particular test to be performed, the test data register to be accessed, or both.

the instruction register is a four-bit wide shift-register with a parallel latch. the most significant bit is connected to the scan_tdi pin and the least significant bit is connected to the scan_tdo pin, so an instruction is shifted in least significant bit first. on entering the capture-ir controller state, the instruction register is loaded with the default instruction "0001", which is the bypass instruction. instructions are shifted into the instruction register on the rising edge of the scan_tck pin while the tap controller is in the shift-ir state.

the 3d-ram supports all three mandatory boundary-scan instructions, namely bypass, sample/preload, and extest. table 10.1 lists the 3d-ram boundary-scan instruction codes.
table 10.1 boundary-scan instruction codes

instruction code   instruction name
0000               extest
0001               bypass
0010               bypass
0011               bypass
0100               sample/preload
0101               bypass
0110               bypass
0111               bypass
1000               bypass
1001               bypass
1010               bypass
1011               bypass
1100               bypass
1101               bypass
1110               bypass
1111               bypass

bypass instruction
the instruction codes for the bypass instruction are any codes except "0000" (for extest) and "0100" (for sample/preload). the bypass instruction selects the bypass register to be connected between the scan_tdi and scan_tdo pins. the bypass register contains a single shift-register stage and is used to provide a minimum length serial path between the scan_tdi and the scan_tdo pins when no scan test operation of the 3d-ram is required. this allows more rapid movement of test data to and from other components on the board.

due to the pull-up resistor on the scan_tdi input, an open circuit fault in the board level test data path shifts all ones into the instruction register, so the bypass register is selected following an instruction scan cycle. this prevents any unwanted interference with the normal operation of the 3d-ram.

sample/preload instruction
the instruction code is "0100". the sample/preload instruction allows the scanning of the boundary-scan register without interference to the normal operation of the 3d-ram. as suggested by the instruction name, the sample/preload instruction can be used to perform two functions:

- sample is performed in the capture-dr controller state. all signals received at the 3d-ram input pins are loaded into the boundary-scan register on the rising edge of the scan_tck pin.
- preload is performed in the update-dr controller state. the data held in the shift-register stage of the output cell is latched on the falling edge of the scan_tck pin.

extest instruction
the instruction code is "0000". the extest instruction allows testing of board interconnections. the extest instruction selects the boundary-scan register to be connected between the scan_tdi and scan_tdo pins. two functions are performed when the extest instruction is selected:

- in the capture-dr controller state, all signals received at the 3d-ram input pins are loaded into the boundary-scan register on the rising edge of the scan_tck pin. this is equivalent to the sample operation in the sample/preload instruction.
- in the update-dr controller state, the data held in the shift-register stage of the output cell is latched and driven to the 3d-ram output pins on the falling edge of the scan_tck pin.
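the instruction codes of table 10.1 and the shift direction of the instruction register can be sketched as follows. the function names are hypothetical, and the model is a reading of the text (four-bit register, most significant bit at scan_tdi, least significant bit at scan_tdo, capture value 0001), not a description of the actual circuit.

```python
IR_WIDTH = 4
CAPTURE_IR_VALUE = 0b0001  # value loaded in the capture-ir state


def decode(code):
    """Map a four-bit instruction code to its name, following table 10.1."""
    if code == 0b0000:
        return "extest"
    if code == 0b0100:
        return "sample/preload"
    return "bypass"


def shift_ir(tdi_bits, ir=CAPTURE_IR_VALUE):
    """Shift bits in at the msb end; return (new register value, bits seen on scan_tdo)."""
    tdo_bits = []
    for bit in tdi_bits:
        tdo_bits.append(ir & 1)                        # lsb drives scan_tdo
        ir = (ir >> 1) | (bit << (IR_WIDTH - 1))       # scan_tdi enters at the msb
    return ir, tdo_bits


# sample/preload (0100) is presented least significant bit first: 0, 0, 1, 0.
loaded, seen_on_tdo = shift_ir([0, 0, 1, 0])

# An open scan_tdi is pulled high, so four shifts load 1111, which decodes as bypass.
open_circuit, _ = shift_ir([1, 1, 1, 1])
```

in this model the captured value 0001 appears on scan_tdo least significant bit first while the new instruction is being shifted in.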
vid_oe boundary-scan cell

the vid_oe pin controls the tri-state buffer of the vid_q bus. therefore, its boundary-scan cell configuration is different from that of a normal input pin. the functions performed on this cell for the sample/preload and extest instructions are summarized below.

- in the capture-dr controller state, the signal received at the vid_oe pin is loaded into the shift-register on the rising edge of the scan_tck pin.
- in the update-dr controller state, the data held in the shift-register is latched on the falling edge of the scan_tck pin. if the instruction is extest, the latched vid_oe data will control the tri-state buffer of the vid_q bus.
formal specification of operations 11

formal specification of operations

this chapter specifies exactly which bits are moved by each type of operation, using a syntax derived from c and verilog.

elements

bit da[4][257][10240]   dram array
bit sa[4][10240]        sense amplifiers
bit ral[4][9]           row address latch
bit vb[2][640]          video buffer
bit vd[16]              video data pins
bit vc[7]               video counter
bit vm[1]               video mode
bit sram[8][256]        static ram
bit dt[8][32]           dirty tag
bit pm[32]              plane mask register
bit dq[32]              pixel alu data pins
bit be[4]               byte enable pins
bit daddr[9]            dram address
bit paddr[6]            pixel alu address
bit ordering of elements

figure 11.1 bit orderings of several elements
[figure: bit-numbering layouts for the palu_be pins, a block, the palu_dq pins, the plane mask, the dirty tags, a page, the video buffer and the vid_q pins]
figure 11.2 orderings of words and blocks in a page
[figure: blocks within a page are numbered 0 to 39; words within a block are numbered 0 to 7]

access page

access page(bit bank[2], bit daddr[9])
{
    bit i[4];
    for(i = 0; i < 9; i++)
        ral[bank][i] <- daddr[i];
    bit j[14];
    for(j = 0; j < 10240; j++)
        sa[bank][j] <- da[bank][daddr][j];
}

duplicate page

duplicate page(bit bank[2], bit daddr[9])
{
    bit i[4];
    for(i = 0; i < 9; i++)
        ral[bank][i] <- daddr[i];
    bit j[14];
    for(j = 0; j < 10240; j++)
        da[bank][daddr][j] <- sa[bank][j];
}
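the two page operations above can be mimicked with a small model of one dram bank. this is a sketch only: the class and method names are hypothetical, one byte stands for one bit, and the array sizes are taken from the elements list.

```python
PAGE_BITS = 10240   # bits per page, from sa[4][10240]
ROWS = 257          # rows per bank, from da[4][257][10240]


class Bank:
    """One dram bank: array rows, sense amplifiers and row address latch."""

    def __init__(self):
        self.da = [bytearray(PAGE_BITS) for _ in range(ROWS)]  # one byte per bit
        self.sa = bytearray(PAGE_BITS)
        self.ral = 0

    def access_page(self, daddr):
        # latch the row address, then copy the row into the sense amplifiers
        self.ral = daddr
        self.sa[:] = self.da[daddr]

    def duplicate_page(self, daddr):
        # latch the row address, then copy the sense amplifiers into that row
        self.ral = daddr
        self.da[daddr][:] = self.sa


bank = Bank()
bank.da[3][5] = 1          # mark one bit in row 3
bank.access_page(3)        # row 3 -> sense amplifiers
bank.duplicate_page(7)     # sense amplifiers -> row 7
```

after these calls row 7 holds a copy of row 3, and the row address latch points at row 7.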
precharge bank

precharge(bit bank[2])
{
}

read block

read block(bit bank[2], bit daddr[9])
{
    bit i[8];
    for(i = 0; i < 256; i++)
        sram[daddr[8..6]][i] <-
            sa[bank][daddr[1..0]*2560 + i[7..6]*640 + daddr[5..2]*64 + i[5..0]];
    bit j[5];
    for(j = 0; j < 32; j++)
        dt[daddr[8..6]][j] <- 0;
}

masked write block

masked write block(bit bank[2], bit daddr[9])
{
    bit i[8];
    for(i = 0; i < 256; i++)
        if(pm[i[4..0]] && dt[daddr[8..6]][{i[4..3],i[7..5]}])
        {
            sa[bank][daddr[1..0]*2560 + i[7..6]*640 + daddr[5..2]*64 + i[5..0]] <-
                sram[daddr[8..6]][i];
            da[bank][ral][daddr[1..0]*2560 + i[7..6]*640 + daddr[5..2]*64 + i[5..0]] <-
                sram[daddr[8..6]][i];
        }
}
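the index arithmetic shared by read block and masked write block can be checked with a short sketch. the helper names are hypothetical. the restriction of daddr[5..2] to the values 0 to 9 is an assumption inferred from the 40 blocks per page shown in figure 11.2; the specification text itself does not state it.

```python
def bits(value, hi, lo):
    """Extract value[hi..lo] as an integer."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def sa_index(daddr, i):
    """Sense-amplifier bit addressed by block bit i (0..255) for dram address daddr."""
    return (bits(daddr, 1, 0) * 2560 + bits(i, 7, 6) * 640
            + bits(daddr, 5, 2) * 64 + bits(i, 5, 0))


def dirty_tag_index(i):
    """{i[4..3], i[7..5]}: byte number in the upper two bits, word number in the lower three."""
    return (bits(i, 4, 3) << 3) | bits(i, 7, 5)


def masked_write_enabled(pm, dt_block, i):
    """A block bit is written back only if its plane mask bit and its dirty tag are both set."""
    return bool(pm[bits(i, 4, 0)] and dt_block[dirty_tag_index(i)])


# Collect every sense-amplifier bit reached by the 40 assumed block addresses.
covered = set()
for group in range(4):            # daddr[1..0]
    for column in range(10):      # daddr[5..2], assumed to run 0..9
        for i in range(256):
            covered.add(sa_index((column << 2) | group, i))
```

under that assumption the 40 blocks tile the 10240-bit page exactly once, which is consistent with 40 blocks of 256 bits per page.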
unmasked write block

unmasked write block(bit bank[2], bit daddr[9])
{
    bit i[8];
    for(i = 0; i < 256; i++)
        if(dt[daddr[8..6]][{i[4..3],i[7..5]}])
        {
            sa[bank][daddr[1..0]*2560 + i[7..6]*640 + daddr[5..2]*64 + i[5..0]] <-
                sram[daddr[8..6]][i];
            da[bank][ral][daddr[1..0]*2560 + i[7..6]*640 + daddr[5..2]*64 + i[5..0]] <-
                sram[daddr[8..6]][i];
        }
}

video transfer

video transfer(bit bank[2], bit daddr[9])
{
    bit i[10];
    for(i = 0; i < 640; i++)
        vb[bank[0]][i] <- sa[bank][640*daddr[3..0] + i];
    if(daddr[8])
    {
        vc[5..0] <- 0;
        vc[6] <- bank[0];
        vm[0] <- daddr[7];
    }
}
video cycle

video cycle(bit enable[1], bit voe[1])
{
    if(voe)
    {
        bit i[4];
        for(i = 0; i < 16; i++)
            vd[i] <- vb[vc[6]][{vc[5..1],(vc[0]^vm[0])}*16 + i];
    }
    if(enable)
    {
        if(vc[5..0] == 39)
        {
            vc[5..0] <- 0;
            vc[6] <- ~vc[6];
        }
        else
        {
            vc[5..0] <- vc[5..0] + 1;
        }
    }
}

data read

data read(bit paddr[6])
{
    bit i[5];
    for(i = 0; i < 32; i++)
        if(be[i[4..3]])
            dq[i] <- sram[paddr[5..3]][paddr[2..0]*32 + i];
}
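the video transfer and video cycle operations can be followed with a small model of the video port. the class and method names are hypothetical, and each list element stands for one bit of the specification (the example fills the page with word numbers instead of bits so that the output order is visible).

```python
class VideoPort:
    """Two 640-bit video buffers, a 7-bit video counter and a 1-bit video mode."""

    def __init__(self):
        self.vb = [[0] * 640, [0] * 640]
        self.vc = 0   # bit 6 selects the buffer, bits 5..0 count words 0..39
        self.vm = 0   # 1 swaps the two words of each pair (reversed mode)

    def video_transfer(self, sa_page, bank, daddr):
        line = daddr & 0xF                       # daddr[3..0]
        self.vb[bank & 1] = list(sa_page[640 * line:640 * line + 640])
        if daddr & 0x100:                        # daddr[8]: initial transfer
            self.vc = (bank & 1) << 6
            self.vm = (daddr >> 7) & 1           # daddr[7]

    def video_cycle(self, enable=1, voe=1):
        buf, count = self.vc >> 6, self.vc & 0x3F
        word = None
        if voe:
            start = (count ^ self.vm) * 16       # {vc[5..1], vc[0]^vm[0]} * 16
            word = self.vb[buf][start:start + 16]
        if enable:
            if count == 39:
                self.vc = (buf ^ 1) << 6         # wrap and switch buffers
            else:
                self.vc = (buf << 6) | (count + 1)
        return word


sa_page = [k // 16 for k in range(10240)]        # element value = word number
port = VideoPort()
port.video_transfer(sa_page, bank=0, daddr=0x100)
normal_order = [port.video_cycle()[0] for _ in range(4)]
port.video_transfer(sa_page, bank=0, daddr=0x180)
reversed_order = [port.video_cycle()[0] for _ in range(4)]
```

the two orders correspond to the normal and reversed sequences drawn in figure 8.15.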
stateless initial data write

stateless initial data write(bit paddr[6])
{
    bit i[5];
    for(i = 0; i < 32; i++)
        if(be[i[4..3]])
            sram[paddr[5..3]][paddr[2..0]*32 + i] <- dq[i];
    bit j[5];
    for(j = 0; j < 32; j++)
        if(cds[0] == 0)                 /* (8,8,8,8) normal mode */
        {
            if((paddr[2..0] == j[2..0]) && be[j[4..3]])
                dt[paddr[5..3]][j] <- 1;
            else
                dt[paddr[5..3]][j] <- 0;
        }
        else                            /* (4,4,4,4) 16-bit color mode */
        {
            if(((be[3] || be[2]) == 1) && ((be[1] || be[0]) == 1))
                error("illegal byte enable combination");
            else if((paddr[2..0] == j[2..0]) && j[4])
                dt[paddr[5..3]][j] <- (be[3] || be[1]);
            else if((paddr[2..0] == j[2..0]) && !j[4])
                dt[paddr[5..3]][j] <- (be[2] || be[0]);
            else
                dt[paddr[5..3]][j] <- 0;
        }
}
stateless normal data write

stateless normal data write(bit paddr[6])
{
    bit i[5];
    for(i = 0; i < 32; i++)
        if(be[i[4..3]])
            sram[paddr[5..3]][paddr[2..0]*32 + i] <- dq[i];
    bit j[5];
    for(j = 0; j < 32; j++)
        if (cds[0] == 0) /* (8,8,8,8) normal mode */
        {
            if((paddr[2..0] == j[2..0]) && be[j[4..3]])
                dt[paddr[5..3]][j] <- 1;
        }
        else /* (4,4,4,4) 16-bit color mode */
        {
            if(((be[3] || be[2]) == 1) && ((be[1] || be[0]) == 1))
                error("illegal byte enable combination");
            else if((paddr[2..0] == j[2..0]) && j[4])
                dt[paddr[5..3]][j] <- (be[3] || be[1]);
            else if((paddr[2..0] == j[2..0]) && !j[4])
                dt[paddr[5..3]][j] <- (be[2] || be[0]);
        }
}

replace dirty tag

replace dirty tag(bit paddr[6])
{
    bit i[5];
    for(i = 0; i < 32; i++)
        if(be[i[4..3]])
            if (i == paddr[2..0] + 8*i[4..3])
                dt[paddr[5..3]][i] <- dq[i];
}
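The difference between the stateless initial and stateless normal data writes lies in how they treat dirty tag bits that the write does not touch. A minimal Python sketch for the (8,8,8,8) mode only, with the dirty tag and byte enables held as plain integers; the function name dirty_tag_after_write and the initial flag are hypothetical.

```python
def dirty_tag_after_write(dt: int, paddr: int, be: int, initial: bool) -> int:
    """Return the 32-bit dirty tag of block paddr[5..3] after a stateless write.

    (8,8,8,8) mode only. Bit j of the tag covers byte j[4..3] of word j[2..0].
    An initial write clears every bit it does not set; a normal write
    leaves the untouched bits as they were.
    """
    word = paddr & 0x7                  # paddr[2..0]
    new_dt = 0 if initial else dt
    for byte in range(4):
        if (be >> byte) & 1:            # be[byte] enables this byte lane
            new_dt |= 1 << (byte * 8 + word)
    return new_dt


if __name__ == "__main__":
    # Write bytes 0 and 2 of word 2 into a block whose tag was fully dirty.
    print(hex(dirty_tag_after_write(0xFFFFFFFF, 0b000010, 0b0101, initial=True)))
    print(hex(dirty_tag_after_write(0x00000001, 0b000010, 0b0101, initial=False)))
```

The (4,4,4,4) mode is left out here because its byte enable pairing and error case would need separate handling.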
or dirty tag

or dirty tag(bit paddr[6])
{
    bit i[5];
    for(i = 0; i < 32; i++)
        if(be[i[4..3]] && dq[i])
            if (i == paddr[2..0] + 8*i[4..3])
                dt[paddr[5..3]][i] <- dq[i];
}

write plane mask register

write plane mask register()
{
    bit i[5];
    for(i = 0; i < 32; i++)
        if(be[i[4..3]])
            pm[i] <- dq[i];
}
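The replace dirty tag and or dirty tag operations differ only in whether a 0 on dq can clear a tag bit. The following Python sketch transcribes the two pseudocode routines literally, with dt, be, and dq held as integers; the function names are hypothetical, and the sketch is a reading aid rather than a model of the device.

```python
def replace_dirty_tag(dt: int, paddr: int, be: int, dq: int) -> int:
    """Copy dq[i] into dt[i] for each enabled byte lane, at bit paddr[2..0] of that lane."""
    word = paddr & 0x7
    for i in range(32):
        byte = (i >> 3) & 0x3
        if (be >> byte) & 1 and i == word + 8 * byte:
            bit = (dq >> i) & 1
            dt = (dt & ~(1 << i)) | (bit << i)
    return dt


def or_dirty_tag(dt: int, paddr: int, be: int, dq: int) -> int:
    """Set dt[i] where dq[i] is 1 under the same selection; never clears a bit."""
    word = paddr & 0x7
    for i in range(32):
        byte = (i >> 3) & 0x3
        if (be >> byte) & 1 and (dq >> i) & 1 and i == word + 8 * byte:
            dt |= 1 << i
    return dt


if __name__ == "__main__":
    print(hex(replace_dirty_tag(0xFFFFFFFF, 0, 0b1111, 0x00000000)))
    print(hex(or_dirty_tag(0x00000000, 0, 0b1111, 0xFFFFFFFF)))
```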
appendix a 12

electronic device group mitsubishi rev. 1.03 3d-ram (M5M410092B) 167
12 appendix a

appendix a glossary

some of the terms used in this document may be unfamiliar to the reader, and some may have a specific meaning in the context of this document. for convenience, these terms are collected here and provided with a brief definition and a reference to the paragraphs where the terms are explained in greater detail. this glossary is not intended to be exhaustive.

3d-ram
an innovative 10-mbit cached dual-port cmos memory device that dramatically improves the performance of a three-dimensional computer graphics system with on-chip support for the z-buffer hidden surface removal algorithm and for full blending and logical raster operations, achieving a peak bandwidth of 14.6 gbytes/s and a sustained bandwidth of 400 mbytes/s (for the -10 speed grade).

blending
a computer graphics operation for simulating the visual effect of overlapping objects with the foreground objects being partially transparent. an example of a blending equation is overall color = (a) x (color of foreground object) + (1 - a) x (color of background object), where a is the percentage of light transmitted through the medium of the foreground object. each of the four 8-bit blend units in the pixel alu can perform one of the two multiplications and then the addition, provided that the product of the other multiplication is supplied by the rendering controller. (pages 10 and 26)

block
a unit of memory organized into eight 32-bit words. this is the unit of data movement between a dram bank and the pixel buffer. (pages 4 and 21)

byte
a unit of memory containing 8 bits of data. this is the unit of data operation for the four blend units in the pixel alu. the rendering controller can enable or disable the writing (to 3d-ram) or reading (from 3d-ram) of the individual bytes in a word. (pages 10, 15, and 26)

color buffer
the collection of memory that contains all color bits of all pixels to be displayed on the screen. because the alpha information is needed for blending operations in 3d-ram, the alpha data should also be stored together with the color data. it is also popular to have overlay information stored together with the color information to allow fast display of 2d objects by data multiplexing in a ramdac chip. (chapter 6)

dirty tag
a 32-bit memory in the pixel buffer, indicating which of the 32 bytes in the corresponding 256-bit block in the sram cache have been updated by the pixel alu since the data was transferred from the dram array. a "1" in a bit of the dirty tag indicates that the corresponding byte in the sram cache is newer than the data in the dram array. there are eight such dirty tags in the pixel buffer. (page 22)

in 2d rendering, this feature may also be used for color expansion from a bit to a byte to accelerate drawing of many pixels of the same color. two pixel alu operations, namely "replace dirty tags" and "or dirty tags," may be used to facilitate this application of the dirty tags. (page 23)

in the (4, 4, 4, 4) 16-bit color mode, the setting of the dirty tag bits is the same as in the (8, 8, 8, 8) 32-bit color mode in the sense that if a byte of data is updated, then the corresponding dirty tag bit is set. however, since the palu_be [3:0] pins have different meanings in the (4, 4, 4, 4) 16-bit color mode from the (8, 8, 8, 8) 32-bit color mode, the specific dirty tag bits that are set are different in the two color modes. the mappings for "replace dirty tag" and "or dirty tag" are the same in both modes. (page 47)

double buffer
two color buffers of identical size, with one of them shifting out video data toward the display screen (usually through a ramdac chip) while the other is being updated with new pixel data by the rendering controller. (pages 99 and 101)

dram bank
one of the four 2.5-mbit dram banks in a 3d-ram chip. each bank has 10,240 sense amplifiers, which function as a level-two pixel cache, and 257 pages of 10,240 bits each. all four dram banks are connected with a common 256-bit global bus to interface with the pixel buffer. (page 6)

frame buffer
a collection of memory that contains all bits of data for all pixels in the display screen and, in some systems where the frame buffer is large enough to hold more than the pixel data for the specific display resolutions, extra pixel data temporarily stored outside the memory area for display pixels. pixel data may include bits for the color value of each of the r, g, and b color components, bits for the z (or depth) value, bits for window id, and bits for some other auxiliary functions, such as overlay; only the color bits are actually displayed on the screen, while the other bits are stored with the color bits for faster graphic processing. (chapter 6)

global bus
a 50-mhz (for the -10 speed grade) 256-bit data bus connecting the four dram banks and the pixel buffer. (page 8)

magnitude compare
a pixel alu operation that compares the incoming 32-bit data with the 32-bit data stored in the frame buffer. each bit of the 32-bit magnitude comparison may be masked through the corresponding bit of the magnitude mask register. the result of one of eight possible magnitude comparison tests can govern whether the new data is written into the frame buffer. most commonly, as part of the z-buffer hidden surface removal algorithm, the magnitude comparison tests are performed on the z value of the existing pixel against that of a new pixel intended for the same screen location. (pages 10, 51 and 61)

match compare
a pixel alu operation that compares the incoming 32-bit data with the 32-bit data stored in the frame buffer. each bit of the 32-bit match comparison may be masked through the corresponding bit of the match mask register. the result of one of four possible match comparison tests can govern whether the new data is written into the frame buffer. (pages 10, 51 and 61)

page
a unit of memory containing forty 256-bit blocks. three dram operations (pre, acp, and dup) operate on the entire 10,240-bit page in one command. (pages 4 and 75)
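As a worked illustration of the blending equation given under "blending" in this glossary, the following Python sketch applies it to 8-bit channel values. Floating-point arithmetic, round-half-up, and clamping to 0..255 are assumptions made for readability; the fixed-point format of the on-chip blend units is not reproduced, and the function names are hypothetical.

```python
def blend_channel(fg: int, bg: int, a: float) -> int:
    """overall = a * fg + (1 - a) * bg for one 8-bit channel.

    The result is rounded half up and clamped to 0..255 (both assumptions).
    """
    value = a * fg + (1.0 - a) * bg
    return max(0, min(255, int(value + 0.5)))


def blend_pixel(fg, bg, a: float):
    """Blend two pixels given as equal-length tuples of 8-bit channel values."""
    return tuple(blend_channel(f, b, a) for f, b in zip(fg, bg))


if __name__ == "__main__":
    # 0.25 * 200 + 0.75 * 100 = 125
    print(blend_channel(200, 100, 0.25))
    print(blend_pixel((200, 100, 0, 255), (100, 200, 0, 255), 0.5))
```

In the device, only one of the two products is computed on chip, with the other supplied by the rendering controller; the sketch computes both for simplicity.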
picking
a computer graphics operation to select a drawn object on the display screen. a familiar example in 2d graphics is selecting an icon by clicking a mouse button. in 3d graphics, the selection is complicated by the z values of objects and the size and locations (x, y, z) of the 3d selection cursor, and is supported by the on-chip picking logic. (pages 12 and 55)

picking logic
a pixel alu function that provides to the rendering controller a hit flag, indicating if a pixel falls within the 3d selection cursor. (pages 12 and 55)

pixel alu
an on-chip 100-mhz (for the -10 speed grade) processing unit with a 7-stage pipeline that performs destination blending/logical raster operation, magnitude comparison, and match comparison all in parallel, thus converting the read-modify-write interface with the rendering controller to a write-mostly one. (pages 9, 25 and 56)

pixel buffer
the triple-port 2,048-bit sram serving as a level-one pixel cache in 3d-ram, with two 100-mhz (for the -10 speed grade) 32-bit buses interfacing with the pixel alu and a 50-mhz (for the -10 speed grade) 256-bit bus interfacing with the dram arrays. also included in the pixel buffer are eight 32-bit dirty tags and a 32-bit plane mask. (pages 7 and 21)

plane mask
a 32-bit register that affects both the stateful data writes to the pixel buffer and the masked block writes (mbw) to the dram arrays. the effect is simultaneous on both types of operations when they are performed concurrently by the pixel alu port and the dram port, respectively. with respect to the pixel buffer, the plane mask facilitates the "write-per-bit" function on the 32-bit result of the rop/blend units. for mbw, the effect of the plane mask on the (32-bit) word 0 is simply extended to words 1 through 7. (pages 23 and 58)

rop
an abbreviation for raster operation. there is a total of sixteen standard logical raster operations, including and, or, and xor. (pages 10, 26 and 60)

stateful data write
a pixel data operation of the pixel alu. the word "stateful" refers to the controls of the dual compare units and the rop/blend units of the pixel alu on whether and what data is to be written into the pixel buffer. there are two types of stateful data write: namely, stateful initial data write and stateful normal data write. (page 66)

stateless data write
a pixel data operation of the pixel alu. the word "stateless" refers to the straight pass-through of pixel data from the 3d-ram inputs to the pixel buffer, with the data write unaffected by the rop/blend units and the dual compare units of the pixel alu. there are two types of stateless data write: namely, stateless initial data write and stateless normal data write. (page 72)

stencil or stenciling
stenciling applies a test that compares a reference value with the value stored at a pixel in the stencil buffer and then performs two tasks based on the results of this stencil test and the depth test: (1) operates on the update of the stencil data and (2) controls the write enable of the color buffer and the depth buffer. the most common use of the stencil function is to generate an irregularly shaped region of some desirable color pattern, such as a decal. stencil may also be applied to produce stipples or screen door patterns, which are sometimes employed to achieve a transparency effect without resorting to the expensive multipliers and adders in the blending function. stencil is also useful in hidden surface removal. (pages 40 and 66)

video buffer
one of the two serial access video buffers of 40x16 bits, which alternate every forty vclk cycles to shift video data out. one video buffer is connected with two dram banks, and all 640 bits of a video buffer are loaded by a single dram command (vdx). the video data output rate is 16 bits per 14-ns cycle. (pages 7 and 78)

word
a unit of memory representing four bytes. this is the unit of data movement between the pixel alu and the pixel buffer. (pages 4 and 21)

z buffer
the collection of memory that contains all z values of all pixels to be displayed on the screen. (pages 99 and 101)
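Several of the sizes and rates quoted in this glossary can be cross-checked with simple arithmetic. The Python sketch below uses only figures from the glossary entries (-10 speed grade); the constant and function names are hypothetical, and the bandwidth formula is a plain clock-times-width calculation that ignores any protocol overhead.

```python
BLOCK_BITS = 256          # one block: eight 32-bit words
BLOCKS_PER_PAGE = 40      # one page: forty blocks
PIXEL_BUFFER_BLOCKS = 8   # one dirty tag per cached block
VIDEO_WORDS = 40          # one video buffer: 40 x 16 bits
VIDEO_WORD_BITS = 16

PAGE_BITS = BLOCK_BITS * BLOCKS_PER_PAGE
PIXEL_BUFFER_BITS = BLOCK_BITS * PIXEL_BUFFER_BLOCKS
VIDEO_BUFFER_BITS = VIDEO_WORDS * VIDEO_WORD_BITS


def bus_bytes_per_second(clock_hz: float, width_bits: int) -> float:
    """Raw bandwidth of a bus that moves width_bits on every clock cycle."""
    return clock_hz * width_bits / 8


if __name__ == "__main__":
    print(PAGE_BITS, PIXEL_BUFFER_BITS, VIDEO_BUFFER_BITS)
    print(bus_bytes_per_second(100e6, 32))     # one pixel alu bus
    print(bus_bytes_per_second(50e6, 256))     # global bus
    print(VIDEO_WORD_BITS / 8 / 14e-9)         # video port, bytes per second
```

The 100-mhz 32-bit pixel alu bus works out to 400 mbytes/s, which coincides with the sustained bandwidth quoted in the 3d-ram entry; whether that is how the quoted figure was derived is not stated in this section.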

